Dual-sourcing inventory systems, in which one supplier is faster (i.e. express) and more costly, while the 
other is slower (i.e. regular) and cheaper, arise naturally in many real-world supply chains. These systems 
are notoriously difficult to optimize due to the complex structure of the optimal solution and the curse of 
dimensionality, having resisted solution for over 40 years. Recently, so-called Tailored Base-Surge (TBS) 
policies have been proposed as a heuristic for the dual-sourcing problem. Under such a policy, a constant 
order is placed at the regular source in each period, while the order placed at the express source follows a 
simple order-up-to rule. Numerical experiments by several authors have suggested that such policies perform 
well as the lead time difference between the two sources grows large, which is exactly the setting in which 
the curse of dimensionality leads to the problem becoming intractable. However, providing a theoretical 
foundation for this phenomenon has remained a major open problem. 

In this paper, we provide such a theoretical foundation by proving that a simple TBS policy is indeed 
asymptotically optimal as the lead time of the regular source grows large, with the lead time of the express 
source held fixed. Our main proof technique combines novel convexity and lower-bounding arguments, an 
explicit implementation of the vanishing discount factor approach to analyzing infinite-horizon Markov decision 
processes, and ideas from the theory of random walks and queues, significantly extending the methodology 
and applicability of a novel framework for analyzing inventory models with large lead times recently 
introduced by Goldberg and co-authors in the context of lost-sales models with positive lead times. 

Key words: inventory, dual-sourcing, Tailored Base-Surge policy (TBS), lead time, asymptotic optimality, 
convexity. 
1. Introduction 

A common practice in the management of global supply chains is dual-sourcing (cf. Rao, Scheller-Wolf and Tayur (2000)). Under a dual-sourcing strategy, the companies usually purchase their materials from a regular supplier at a lower cost, but they are also able to obtain materials from an expedited supplier at a higher cost under emergency circumstances. For example, in the summer of 2003, Amazon used FedEx to deliver the new Harry Potter more promptly and maintained regular shipping via UPS (cf. Kelleher (2003), Veeraraghavan and Scheller-Wolf (2008)). Allon and Van Mieghem (2010) describe an example of a $10 billion high-tech U.S. company that has two suppliers, one in Mexico and one in China. The one in Mexico has a shorter lead time but a higher per-unit ordering cost; the one in China has a longer lead time (5 to 10 times longer) but a lower per-unit ordering cost. The company takes advantage of the dual-sourcing strategy to meet the demand more responsively (from Mexico) as well as less expensively (from China). 

Although dual-sourcing is attractive, and very relevant to practice, optimizing a dual-sourcing inventory system is notoriously challenging. Such inventory systems have been studied now for over forty years, but the structure of the optimal policy remains poorly understood, with the exception of when the system is consecutive, i.e., the lead time difference between the two sources is exactly one. More specifically, the earliest studies of periodic review dual-sourcing inventory models include Barankin (1961), Daniel (1963), and Neuts (1964), which showed that base-stock (also known as order-up-to) policies are optimal when the lead times of the two sources are zero and one respectively. Fukuda (1964) extended the result to general lead time settings as long as the lead time difference remains one. Whittemore and Saunders (1977) showed that the optimal policy is no longer a simple base-stock policy when the lead time difference is beyond one and the structure of the optimal policy can be quite complex. Furthermore, it is well known that a dual-sourcing inventory system can be regarded as a generalization of a lost-sales inventory system (cf. Sheopuri, Janakiraman and Seshadri (2010)). Indeed, the intractability of both the dual-sourcing and lost-sales inventory models has a common source - as the lead time grows, the state-space of the 
natural dynamic programming (DP) formulation grows exponentially, rendering such techniques impractical. This issue is typically referred to as the “curse of dimensionality” (cf. Karlin and Scarf (1958), Morton (1969), Zipkin (2008)), and we refer the reader to Goldberg et al. (2015) and Xin and Goldberg (2015) for a further discussion. 


There is a vast literature investigating periodic review dual-sourcing inventory models as well as their variants, and we refer the interested reader to the survey of Minner (2003), as well as e.g. the more recent works of Feng et al., Fox, Metters and Semple, Chen, Xue and Yang, Huggins and Olsen, Angelus and Özer (2015), Gong, Chao and Zheng (2014), Boute and Van Mieghem (2015), Song and Zipkin (2009), and the references therein. 


As an exact solution seems out of reach, the operations research and management communities have instead investigated certain structural properties of the optimal policy (cf. Hua et al. (2014)), and exerted considerable effort towards constructing various heuristic policies. Veeraraghavan and Scheller-Wolf (2008) proposed the family of dual index (DI) policies, which have two base-stock levels, one for the regular source and one for the express source, and “order up” to bring appropriate notions of inventory position up to these levels. Scheller-Wolf, Veeraraghavan and van Houtum (2008) analyzed the closely related class of single index (SI) policies, for which the relevant notions of inventory position are different. Both families of policies seem to perform well in numerical studies. Sheopuri, Janakiraman and Seshadri (2010) considered two generalized classes of policies: one with an order-up-to structure for the express source, and one with an order-up-to structure for the regular source. Their numerical experiments showed that such policies can outperform DI policies. In the presence of production capacity costs, Boute and Van Mieghem (2015) studied dual-sourcing smoothing policies, under which the order quantities from both sources in each period are convex combinations of observed past demands. They analyzed such policies under normally distributed demand, and their numerical results showed that these policies performed better for higher capacity costs and longer lead time differences (between the two sources). 
A simple and natural policy that is implemented in practice, which will be the subject of our own investigations, is the so-called Tailored Base-Surge (TBS) policy. It was first proposed and analyzed in Allon and Van Mieghem (2010), where we note that closely related standing order policies had been studied previously (cf. Rosenshine and Obee (1976), Janssen and De Kok (1990)). Under such a TBS policy, a constant order is placed at the regular source in each period to meet a base level of demand, while the orders placed at the express source follow an order-up-to rule to manage demand surges. We refer to Mini-Case 6 in Van Mieghem (2008) for more about the motivation and background of TBS policies. Note that dual-sourcing inventory systems in which a constant-order policy is implemented for the regular source are essentially equivalent to single-sourcing inventory systems with constant returns, which have been investigated in the literature (cf. Fleischmann and Kuik (2003), DeCroix, Song and Zipkin (2005)). Allon and Van Mieghem (2010) analyzed TBS policies in a continuous review model, and their focus was to find the best TBS policy. Numerical results in Klosterhalfen, Kiesmüller and Minner (2011) and Rossi, Rijpkema and van der Vorst (2012) showed that TBS policies are comparable to DI policies, and outperform DI policies for some problem instances. Allon and Van Mieghem (2010) conjectured that this policy performs more effectively as the lead time difference between the two sources grows. Janakiraman, Seshadri and Sheopuri (2015) (henceforth denoted JSS) analyzed a periodic review model and studied the performance of the TBS policy. They provided an explicit bound on the performance of TBS policies compared to the optimal one when the demand had a specific structure, and provided numerical experiments suggesting that the performance of the TBS policy improves as the lead time difference grows large. 

However, to date there is no theoretical justification for the good behavior of TBS policies as the lead time difference grows large, and giving a solid theoretical foundation to this observed phenomenon remains a major open question. We note that until recently, a similar state of affairs existed regarding the good performance of constant-order policies as the lead time grows large in single-source lost-sales inventory models. However, using tools from applied probability, queueing theory, and convexity, this phenomenon was recently explained in Goldberg et al. (2015) and Xin and Goldberg (2015), in which it was proven that a simple constant-order policy is asymptotically optimal in this setting as the lead time of the single source grows large. The intuition here is that as the lead time grows large, so much randomness is introduced into the system between when an order is placed and when that order is received, that it is essentially impossible for any algorithm to meaningfully use the state information to make significantly better decisions. Thus a policy which ignores the state information (i.e. a constant-order policy) performs nearly as well as an optimal policy. We note that the results of Xin and Goldberg (2015) further demonstrate that the optimality gap of the constant-order policy actually shrinks exponentially fast to zero as the lead time grows large, and provide explicit and effective bounds even for moderate-to-small lead times. 

1.1. Our contributions 

In this paper, we resolve this open question by proving that, when the lead time of the express source is held fixed, a simple TBS policy is asymptotically optimal as the lead time of the regular source grows large. Our results provide a solid theoretical foundation for the conjectures and numerical experiments of Allon and Van Mieghem (2010) and JSS. Interestingly, the simple TBS policy performs nearly optimally exactly when standard DP-based methodologies become intractable due to the aforementioned “curse of dimensionality”. Furthermore, as the “best” TBS policy can be computed by solving a convex program that does not depend on the lead time of the regular source (cf. JSS), our results lead directly to very efficient algorithms (with complexity independent of the lead time of the regular source) with asymptotically optimal performance guarantees. We also explicitly bound the optimality gap of the TBS policy for any fixed lead time (of the regular source), and prove that this decays inverse-polynomially in the lead time of the regular source. Perhaps most importantly, since many companies are already implementing such TBS policies (cf. Allon and Van Mieghem (2010)), our results provide strong theoretical support for the widespread use of TBS policies in practice. Our main proof technique combines novel convexity and lower-bounding arguments, an explicit implementation of the vanishing discount factor approach to analyzing infinite-horizon Markov decision processes (MDP), and ideas from the theory of random walks and queues, significantly extending the methodology and applicability of a novel framework for analyzing inventory models with large lead times recently introduced in Goldberg et al. (2015) and Xin and Goldberg (2015) in the context of lost-sales models with positive lead times. Indeed, in the present work we relate the performance of an optimal policy to a certain dynamic optimization problem by applying the conditional Jensen’s inequality, while in Xin and Goldberg (2015) the relevant optimal policy could be bounded by a static optimization problem after applying the (non-conditional) Jensen’s inequality. The inherently dynamic nature of the resulting bounds introduces several additional difficulties not encountered previously, which we address in the present work. 

1.2. Outline of paper 

The rest of the paper is organized as follows. We formally define the dual-sourcing problem in 
Section 2 and describe the TBS policy in Section 2.1. We state our main result in Section 2.2 and 
prove our main result in Section 3. We summarize our main contributions and propose directions 
for future research in Section 4. We also include a technical appendix in Section 5. 

2. Model description, problem statement and assumptions 

In this section, we formally define our dual-sourcing inventory problem, closely following the definitions given in Sheopuri, Janakiraman and Seshadri (2010). Let $\{D_t\}_{t \in (-\infty,\infty)}, \{D'_t\}_{t \in (-\infty,\infty)}$ be mutually independent sequences of nonnegative independent and identically distributed (i.i.d.) demand realizations, distributed as the non-negative random variable (r.v.) $D$, which we assume to have finite mean, and (to rule out certain trivial degenerate cases) to have strictly positive (possibly infinite) variance. Here we have introduced two doubly indexed sequences to prevent any possible confusion regarding dependencies of various demand realizations. Let $G$ be an independent geometrically distributed r.v., where $P(G = k) = 2^{-k}, k \geq 1$. As a notational convenience, let us define all empty sums to equal zero, empty products to equal one, $\frac{1}{\infty} = 0$, $\mathbf{0}$ ($\mathbf{1}$) denote the all zeros (ones) vector, and $1(A)$ denote the indicator of the event $A$. Let $L \geq 1$ be the deterministic lead time of the regular source (R), and $L_0 \geq 0$ the deterministic lead time of the express source (E), where $L \geq L_0 + 1$. Let $c_R, c_E$ be the unit purchase costs of the regular and express sources, and $h, b$ be the unit holding and backorder costs respectively, with $c = c_E - c_R > 0$. In addition, let $I_t$ denote the on-hand inventory at the start of period $t$ (before any orders or demands are received), and $q^R_t$ ($q^E_t$) denote the order placed from R (E) at the beginning of period $t$. Note that due to the lead times, the order received from R (E) in period $t$ is $q^R_{t-L}$ ($q^E_{t-L_0}$). As we will be primarily interested in the corresponding long-run-average problem, and for simplicity (in later proofs), we suppose that the initial conditions are such that (s.t.) the initial inventory is $-$ […], and no initial orders have been placed from either R or E. Indeed, the associated system state will prove convenient to use as a “regeneration point” when analyzing certain Markov chains which arise in our proofs, where we note that the geometric distribution allows us to preclude certain kinds of pathological periodic / lattice behavior which might otherwise interfere with proving the existence of relevant stationary measures. We note that although assuming such a convenient randomized initial condition simplifies several technical proofs along these lines, such an assumption is not strictly necessary for our analysis, since the associated long-run average problem is insensitive to the particular choice of initial conditions. 

As a notational convenience, we define $q^R_k = q^E_k = 0, k \leq 0$. For $t = 1,\ldots,T$, the events in period 
$t$ are ordered as follows. 

• Ordering decisions from R and E are made (i.e. $q^R_t, q^E_t$ are chosen); 

• New inventory $q^R_{t-L} + q^E_{t-L_0}$ is delivered and added to the on-hand inventory; 

• The demand $D_t$ is realized, costs for period $t$ are incurred, and the inventory is updated. 
Note that the on-hand inventory is updated according to $I_{t+1} = I_t + q^R_{t-L} + q^E_{t-L_0} - D_t$, and may be 
negative since backorder is allowed. 
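
The event ordering and the inventory update can be traced with a few lines of code. The following is a minimal sketch and not part of the original model description: the function name `simulate_inventory`, the list-based interface, and the default initial inventory of zero are all illustrative assumptions.

```python
def simulate_inventory(regular_orders, express_orders, demands, L, L0, initial_inventory=0.0):
    """Trace the on-hand inventory I_1, ..., I_{T+1}.

    regular_orders[t-1] and express_orders[t-1] hold the orders placed in
    period t (t = 1..T); orders for periods k <= 0 are taken to be zero.
    """
    if not (L0 >= 0 and L >= L0 + 1):
        raise ValueError("need L0 >= 0 and L >= L0 + 1")
    inventory = [initial_inventory]
    for t in range(1, len(demands) + 1):
        # Deliveries in period t: regular order from period t-L, express order from period t-L0.
        arriving_regular = regular_orders[t - L - 1] if t - L >= 1 else 0.0
        arriving_express = express_orders[t - L0 - 1] if t - L0 >= 1 else 0.0
        # I_{t+1} = I_t + q^R_{t-L} + q^E_{t-L0} - D_t (may go negative: backorders).
        inventory.append(inventory[-1] + arriving_regular + arriving_express - demands[t - 1])
    return inventory


# Hypothetical example: L = 3, L0 = 1, unit demand in each of five periods.
path = simulate_inventory([5.0] * 5, [2.0, 0.0, 0.0, 0.0, 0.0], [1.0] * 5, L=3, L0=1)
print(path)
```

In this example the express order placed in period 1 arrives in period 2, while the first regular order only arrives in period 4, which is why the inventory dips below zero before recovering.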

We now formalize the family of admissible policies $\Pi$, which will determine the new orders placed. 
An admissible policy $\pi$ consists of a sequence of measurable maps $\{f^\pi_t, t \geq 1\}$, where each $f^\pi_t$ is a 
deterministic measurable function with domain $\mathbb{R}^{L+L_0+1}$ and range $\mathbb{R}_+^2$. In that case, for a given 
policy $\pi$, the regular order placed in period $t$ equals the first component of $f^\pi_t(q^R_{t-L},\ldots,q^R_{t-1},q^E_{t-L_0},\ldots,q^E_{t-1},I_t)$, while 
the express order placed in period $t$ equals the second component of $f^\pi_t(q^R_{t-L},\ldots,q^R_{t-1},q^E_{t-L_0},\ldots,q^E_{t-1},I_t)$, and $\Pi$ denotes the 
family of all such admissible policies $\pi$. 

Let $G(y)$ be the sum of the holding and backorder costs when the inventory level equals $y$ at 
the end of a time period, i.e. $G(y) = h y^+ + b y^-$, where $x^+ = \max(x,0)$, $x^- = \max(-x,0)$. Here we 
note that $G$ is convex and Lipschitz, and for $x,y \in \mathbb{R}$, 

$$|G(x) - G(y)| \leq \max(b,h)|x - y| , \quad |G(x)| \geq \min(b,h)|x|. \qquad (1)$$
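
The two bounds in (1) are easy to sanity-check numerically. The sketch below is illustrative only; the helper names `holding_backorder_cost` and `check_bounds`, the grid, and the cost values are assumptions chosen for the example.

```python
def holding_backorder_cost(y, h, b):
    """G(y) = h * y^+ + b * y^-."""
    return h * max(y, 0.0) + b * max(-y, 0.0)


def check_bounds(h, b, points, tol=1e-12):
    """Check both inequalities in (1) for every pair drawn from `points`."""
    for x in points:
        gx = holding_backorder_cost(x, h, b)
        # Lower bound: |G(x)| >= min(b, h) * |x|.
        if abs(gx) < min(b, h) * abs(x) - tol:
            return False
        for y in points:
            gy = holding_backorder_cost(y, h, b)
            # Lipschitz bound: |G(x) - G(y)| <= max(b, h) * |x - y|.
            if abs(gx - gy) > max(b, h) * abs(x - y) + tol:
                return False
    return True


grid = [k / 4.0 for k in range(-20, 21)]
print(check_bounds(h=1.0, b=3.0, points=grid))
```

A grid check of this kind is not a proof, but it catches sign or coefficient mix-ups between the holding cost $h$ and the backorder cost $b$.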


For $t \geq L_0 + 1$, let $C_t$ be the sum of the holding and backorder costs incurred in time period $t$, plus the ordering cost incurred for orders placed in period $t - L_0$, i.e. $C_t = c_R q^R_{t-L_0} + c_E q^E_{t-L_0} + G(I_t + q^R_{t-L} + q^E_{t-L_0} - D_t)$. We note that charging in period $t$ for orders placed in period $t - L_0$ is a standard “accounting trick” in the inventory literature to simplify various notations (cf. Zipkin (2008a)), and for the problems considered is without loss of generality (w.l.o.g.). To denote the dependence of the cost on the policy $\pi$, we use the notation $C^\pi_t$. Let $C(\pi)$ denote the long-run average cost incurred by a policy $\pi$, i.e. $C(\pi) = \limsup_{T \to \infty} \frac{\sum_{t=L_0+1}^{T} E[C^\pi_t]}{T - L_0}$, where we again note that starting the relevant sum at $t = L_0 + 1$ (as opposed to $t = 1$) is w.l.o.g. for the problems considered. The value of the corresponding long-run average cost dual-sourcing inventory optimization problem is denoted by $OPT(L) = \inf_{\pi \in \Pi} C(\pi)$. 

Before proceeding, it will be useful to apply certain well-known reductions to the problem at hand, where we note that similar reductions are known to hold for many classical inventory problems with backlogging (cf. Karlin and Scarf (1958), Scarf (1960)). First, as stated in Sheopuri, Janakiraman and Seshadri (2010), for the long-run average cost problems which will be the focus of our analysis, any problem with $c_R > 0$ can be transformed into an equivalent problem with $c_R = 0$. As such we assume throughout that $c_R = 0$. Let us define the so-called expedited inventory position at time $t \geq 1$ as $\mathcal{I}_t = I_t + \sum_{k=t-L_0}^{t-1} q^E_k + \sum_{k=t-L}^{t-L+L_0} q^R_k$, which corresponds to the net inventory at the start of period $t$ plus all orders to be received in periods $t,\ldots,t+L_0$ (which were placed before period $t$), and the truncated regular pipeline at time $t$ as the $(L - L_0 - 1)$-dimensional vector $\mathcal{R}^t = (q^R_{t-L+L_0+1},\ldots,q^R_{t-1})$, with $\mathcal{R}^t_k = q^R_{t-L+L_0+k}, k = 1,\ldots,L - L_0 - 1$. Let $\hat{\Pi}$ denote those policies belonging to $\Pi$ with the additional restriction that the new orders $q^R_t, q^E_t$ are measurable functions of only $\mathcal{I}_t, \mathcal{R}^t$. More formally, $\pi \in \hat{\Pi}$ if there exists a sequence of measurable maps $\{f_t, t \geq 1\}$, where each $f_t$ is a deterministic measurable function with domain $\mathbb{R}^{L-L_0}$ and range $\mathbb{R}_+^2$, s.t. the regular order placed in period $t$ equals the first component of $f_t(\mathcal{R}^t, \mathcal{I}_t)$ and the express order placed in period $t$ equals the second component of $f_t(\mathcal{R}^t, \mathcal{I}_t)$. 

Note that $\mathcal{I}_1 = -$ […] and $\mathcal{R}^1 = \mathbf{0}$. Also, for any policy $\pi \in \hat{\Pi}$ and $t \geq 1$, it holds that $\mathcal{I}_{t+1} = \mathcal{I}_t + q^E_t + \mathcal{R}^t_1 - D_t$, $\mathcal{R}^{t+1}_k = \mathcal{R}^t_{k+1}$ for $k \in [1, L - L_0 - 2]$, and $\mathcal{R}^{t+1}_{L-L_0-1} = q^R_t$. Furthermore, for all $t \geq L_0 + 1$, $C^\pi_t = G(\mathcal{I}_{t-L_0} + q^E_{t-L_0} - \sum_{k=t-L_0}^{t} D_k) + c q^E_{t-L_0}$. Then the following is proven in Sheopuri, Janakiraman and Seshadri (2010). 

Lemma 1 (Sheopuri, Janakiraman and Seshadri (2010) Lemma 2.1). $\inf_{\pi \in \Pi} C(\pi) = \inf_{\pi \in \hat{\Pi}} C(\pi)$, i.e. one may w.l.o.g. restrict oneself to policies belonging to $\hat{\Pi}$. 


For the remainder of the paper, we thus consider the relevant optimization only over policies 
belonging to $\hat{\Pi}$, i.e. 

$$OPT(L) = \inf_{\pi \in \hat{\Pi}} C(\pi). \qquad (2)$$

For a given policy $\pi \in \hat{\Pi}$, let $\mathcal{R}^{t,\pi}$ ($\mathcal{I}^\pi_t$) denote a r.v. distributed as the truncated regular pipeline 
(expedited inventory position) at the start of period $t$ under policy $\pi$. Similarly, let $q^{E,\pi}_t$ ($q^{R,\pi}_t$) denote 
the expedited (regular) order placed in period $t$, and suppose that all these r.v.s are constructed on 
a common probability space, and have the appropriate joint distribution induced by the operation 
of $\pi$ over time. 
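
As a worked check of the state reduction used in this section, the one-step recursion for the expedited inventory position follows directly from its definition and the on-hand inventory update. The derivation below is a sketch written in the notation assumed here, with $\mathcal{I}_t$ for the expedited inventory position and $I_t$ for the on-hand inventory; it only uses $I_{t+1} = I_t + q^R_{t-L} + q^E_{t-L_0} - D_t$.

```latex
\begin{align*}
\mathcal{I}_{t+1}
  &= I_{t+1} + \sum_{k=t+1-L_0}^{t} q^E_k + \sum_{k=t+1-L}^{t+1-L+L_0} q^R_k \\
  &= \big(I_t + q^R_{t-L} + q^E_{t-L_0} - D_t\big)
     + \sum_{k=t+1-L_0}^{t} q^E_k + \sum_{k=t+1-L}^{t+1-L+L_0} q^R_k \\
  &= I_t + \sum_{k=t-L_0}^{t-1} q^E_k + \sum_{k=t-L}^{t-L+L_0} q^R_k
     + q^E_t + q^R_{t-L+L_0+1} - D_t \\
  &= \mathcal{I}_t + q^E_t + q^R_{t-L+L_0+1} - D_t .
\end{align*}
```

The term $q^R_{t-L+L_0+1}$ is the first component of the truncated regular pipeline at time $t$, so the pair consisting of the expedited inventory position and the truncated pipeline evolves without reference to the remaining order history.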

2.1. TBS policy 

In this section, we formally introduce the family of TBS policies, and characterize the “best” TBS 
policy. A TBS policy $\pi_{r,S}$ with parameters $(r,S)$ is defined (cf. JSS) as the policy that places a 
constant order $r$ from R in every period, and follows an order-up-to rule from E which in each 
period raises the expedited inventory position to $S$ (if it is below $S$), and otherwise orders nothing. 
More formally, under this policy $q^R_t = r$, and $q^E_t = \max(0, S - \mathcal{I}_t)$, for all $t$. 
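
The TBS rule can be simulated directly on the expedited inventory position. The following is a minimal sketch under stated assumptions: the function name `simulate_tbs` is hypothetical, the regular unit cost is taken to be zero (so only the express premium $c$ is charged), the regular pipeline is assumed to be already filled with the constant order $r$, and the demand path must be longer than $L_0$.

```python
def simulate_tbs(r, S, demands, L0, c, h, b, start_position=0.0):
    """Average cost per period of the TBS policy (r, S) along one demand path.

    Works with the expedited inventory position and charges each express
    order together with the holding/backorder cost it affects L0 periods later.
    """
    position = start_position
    express, post_order = [], []
    for d in demands:
        q_e = max(0.0, S - position)        # order-up-to rule at the express source
        express.append(q_e)
        post_order.append(position + q_e)
        position = position + q_e + r - d   # constant regular order r enters the position
    periods = len(demands) - L0
    total = 0.0
    for s in range(periods):
        lead_time_demand = sum(demands[s:s + L0 + 1])
        level = post_order[s] - lead_time_demand   # net inventory L0 periods later
        total += c * express[s] + h * max(level, 0.0) + b * max(-level, 0.0)
    return total / periods


# Hypothetical example: constant demand 2, base order r = 1, order-up-to level S = 3.
print(simulate_tbs(r=1.0, S=3.0, demands=[2.0] * 5, L0=0, c=1.0, h=1.0, b=1.0))
```

With constant demand the express source settles into ordering exactly the shortfall between demand and the regular order in every period, which is the "surge" role the policy assigns to it.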
Let $I^r_\infty = \sup_{j \geq 0} \big(jr - \sum_{i=1}^{j} D'_i\big)$. In that case, it follows from the results of JSS that 

$$C(\pi_{r,S}) = c(E[D] - r) + E\Big[G\Big(I^r_\infty + S - \sum_{i=1}^{L_0+1} D_i\Big)\Big]. \qquad (3)$$

Note that for each $r$, the minimization problem $\inf_{S \in \mathbb{R}} C(\pi_{r,S})$ is equivalent to a standard one-period newsvendor problem. Furthermore, defining $F^\infty(r) = \inf_{S \in \mathbb{R}} C(\pi_{r,S})$, it is proven in JSS that $F^\infty(r)$ is convex in $r$ on $(-\infty, E[D])$. Combining the above with standard results for single-server queues (cf. Asmussen (2003)) and (1), we conclude that there exists at least one pair $(r^*, S^*)$ s.t. $r^* \in \arg\min_{0 \leq r \leq E[D]} F^\infty(r)$ and $S^* \in \arg\min_{S \in \mathbb{R}} C(\pi_{r^*,S})$; that this pair defines the TBS policy with least long-run-average cost; and that this pair can be computed efficiently by solving a convex program which is independent of the larger lead time $L$. 
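
To make the computation of a good TBS pair concrete, the cost formula for a TBS policy can be estimated by Monte Carlo: the supremum of the random walk is approximated by a long Lindley-type recursion, and the order-up-to level is then chosen by the newsvendor critical ratio $b/(b+h)$. This is only an illustrative sketch and not the convex program referred to in the text: the function names, the sampling interface `sample_demand(rng)`, the burn-in length, and the crude grid search over $r$ are all assumptions.

```python
import random


def estimate_best_S_and_cost(r, sample_demand, mean_demand, L0, c, h, b,
                             n_samples=2000, burn_in=500, seed=0):
    """Monte Carlo estimate of a cost-minimizing S for fixed r, and of the resulting cost."""
    rng = random.Random(seed)
    shortfalls = []
    for _ in range(n_samples):
        # Lindley recursion: after many steps w approximates sup_j (j*r - sum of j demands).
        w = 0.0
        for _step in range(burn_in):
            w = max(0.0, w + r - sample_demand(rng))
        lead_time_demand = sum(sample_demand(rng) for _k in range(L0 + 1))
        shortfalls.append(lead_time_demand - w)
    shortfalls.sort()
    # Newsvendor step: an empirical b/(b+h) quantile minimizes the sample average of G(S - x).
    index = min(n_samples - 1, int(b / (b + h) * n_samples))
    S = shortfalls[index]
    avg_G = sum(h * max(S - x, 0.0) + b * max(x - S, 0.0) for x in shortfalls) / n_samples
    return S, c * (mean_demand - r) + avg_G


def best_tbs_on_grid(r_grid, **kwargs):
    """Pick the grid value of r with the smallest estimated cost (crude stand-in for a convex solver)."""
    results = [(estimate_best_S_and_cost(r, **kwargs), r) for r in r_grid]
    (S, cost), r = min(results, key=lambda item: item[0][1])
    return r, S, cost


# Hypothetical example: exponential demand with mean 1 and a one-period express lead time.
r, S, cost = best_tbs_on_grid(
    [0.0, 0.25, 0.5, 0.75],
    sample_demand=lambda rng: rng.expovariate(1.0), mean_demand=1.0,
    L0=1, c=2.0, h=1.0, b=4.0, n_samples=300, burn_in=200)
print(r, S, cost)
```

Note that nothing in this computation involves the regular lead time $L$, which mirrors the observation that the best TBS pair does not depend on the larger lead time.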

2.2. Main result 

2.3. Additional definitions and notations 

Before stating our main result, we will need several additional definitions and notations to describe 
various relevant quantities which will appear in our bounds on the optimality gap. For $\theta \geq 0$ and 
$\epsilon \in (0, E[D]]$, let us define 

$$\phi_\epsilon(\theta) = \exp\big(\theta(E[D] - \epsilon)\big) E[\exp(-\theta D)] , \quad \gamma_\epsilon = \inf_{\theta \geq 0} \phi_\epsilon(\theta),$$

and $\vartheta_\epsilon \in \arg\min_{\theta \geq 0} \phi_\epsilon(\theta)$ denote the supremum of the set of minimizers of $\phi_\epsilon(\theta)$, where we define $\vartheta_\epsilon$ to equal $\infty$ if the above infimum is not actually attained. Note that $\phi_\epsilon(\theta)$ is a continuous and convex function of $\theta$ on $(0,\infty)$, and a right-continuous function of $\theta$ at $0$. In addition, it follows from Folland (1999) Theorem 2.27 that $\phi_\epsilon(\theta)$ is right-differentiable at zero, with derivative equal to $-\epsilon$. We conclude from the definition of derivative and a straightforward contradiction argument that $\vartheta_\epsilon > 0$ and $\gamma_\epsilon \in [0,1)$. Let $g = \inf_{x \in \mathbb{R}} E\big[G\big(x - \sum_{i=1}^{L_0+1} D_i\big)\big] > 0$, and $U = C(\pi_{0,0}) = cE[D] + E\big[G\big(-\sum_{i=1}^{L_0+1} D_i\big)\big]$, in which case it is easily verified that $g \leq OPT(L) \leq U$ for all $L \geq L_0 + 1$. We also make the following additional definitions: 


Xq+i 


Po=E(L><E[D])g (0,1) , Po = (^Po(l-Po))^ e (0,1) , Qo =inf{x GM+:E(iJ<x) > ipo} G [0,E[D]), 
% = infIE[|z-D|] >0 , Co = 7 ^ min(6,/i)po% , t/o = 64(Lo + 1) ^ E[Z)], 

zeK 240 min(o, h) 

Co = min ^E[Z)] — Qo, ^(? 7 oPo)^) 1 “ 2 ^^Co(C/o2^°+?7o + + l) ^£(0,1 — 2 ^)c(0,.002), 

Yo = 25ff“^(?7o2^o +max(6,/i)7,o??“^(l + Lo + 1. 
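
The exponential-tilting quantity $\exp(\theta(E[D] - \epsilon)) E[\exp(-\theta D)]$ and its infimum over $\theta \geq 0$ are straightforward to evaluate when the Laplace transform of the demand is known. The sketch below is illustrative only: it assumes exponentially distributed demand with mean one (for which $E[\exp(-\theta D)] = 1/(1+\theta)$), and the helper names `phi` and `ternary_argmin` as well as the search interval are assumptions.

```python
import math


def phi(theta, eps, mean_demand, laplace_transform):
    """exp(theta * (E[D] - eps)) * E[exp(-theta * D)]."""
    return math.exp(theta * (mean_demand - eps)) * laplace_transform(theta)


def ternary_argmin(f, lo, hi, iterations=200):
    """Locate a minimizer of a convex function f on [lo, hi] by ternary search."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) <= f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def exp_laplace(theta):
    """Laplace transform of an exponential r.v. with mean 1 (assumed demand model)."""
    return 1.0 / (1.0 + theta)


eps = 0.5
theta_star = ternary_argmin(lambda th: phi(th, eps, 1.0, exp_laplace), 0.0, 50.0)
gamma = phi(theta_star, eps, 1.0, exp_laplace)
print(theta_star, gamma)
```

For this particular demand model the minimizer and the infimum are available in closed form, namely $\epsilon/(1-\epsilon)$ and $(1-\epsilon)e^{\epsilon}$ for $\epsilon \in (0,1)$, which gives a convenient check that the infimum is strictly below one.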

Our main result proves that the best TBS policy is asymptotically optimal as $L \to \infty$, and provides 
explicit bounds on the optimality gap. 

Theorem 1. For all $L_0 \geq 0$, $\epsilon \in (0,1)$, and $L \geq [\ldots] + Y_0 \epsilon^{[\ldots]}$, it holds that $\frac{C(\pi_{r^*,S^*})}{OPT(L)} \leq 1 + \epsilon$. 

Corollary 1. $\lim_{L \to \infty} \frac{C(\pi_{r^*,S^*})}{OPT(L)} = 1$. 


3. Proof of Theorem 1 

3.1. Lower bound for the optimal cost 

In this section, we prove a lower bound for OPT(L) by extending the steady-state/convexity approach of Xin and Goldberg (2015) to the dual-sourcing setting. We note that here our lower bounds will be of a dynamic nature, in contrast to the bounds used in Xin and Goldberg (2015) which were of a static nature. As in Xin and Goldberg (2015), we will proceed by relating the “long-run behavior” of “an optimal policy” to a certain TBS policy. At a high level, we will combine convexity and the conditional Jensen’s inequality with the fact that the r.v.s corresponding to (appropriately defined stationary versions of) the different components of the truncated regular pipeline vector (under the optimal policy) have the same mean, which will (approximately) coincide with the constant order from R in our TBS policy. Furthermore, when we apply the conditional Jensen’s inequality to certain terms corresponding to (appropriately defined stationary versions of) the expedited orders under the same optimal policy, the resulting terms will be suitably measurable functions of past demands, which will (approximately) coincide with the amount of inventory ordered from E in our TBS policy. 


3.1.1. Connecting to a stationary problem. As in Xin and Goldberg (2015), our program immediately encounters a technical problem. Namely, the natural way to analyze the “long-run behavior” of an optimal policy is through the steady-state distribution of the Markov chain 

























Xin and Goldberg: Asymptotic optimality of TBS policies in dual-sourcing inventory systems 


induced by this policy. However, it is not obvious that this steady-state exists. Actually, it is not even obvious that there exists a stationary optimal policy (so that the dynamics even define a Markov chain), nor even that there exists an optimal policy at all (as opposed to it only being approached). Although such questions have been rigorously analyzed for simpler inventory models in Huh, Janakiraman and Nagarajan (2011), such questions have not been rigorously answered for the setting of more complicated dual-sourcing models. We note that although in Sheopuri, Janakiraman and Seshadri (2010) it is stated in passing that many of the same results should extend to the dual-sourcing setting, no proofs are provided, and the explicit assumptions needed for such a transference are not clarified. A similarly terse exposition on related questions is provided in Hua et al. (2014). Furthermore, in none of these works is the question of existence of and convergence to relevant stationary measures discussed. To overcome this, as in Xin and Goldberg (2015), we first observe that we will not actually need a random vector which is truly the steady-state of the aforementioned Markov chain (which in principle may not exist), but only need to demonstrate the existence of a random vector which has several properties that we would want such a steady-state (if it existed) to have. We now show the existence of such a random vector. We note that although closely related questions have been studied in the MDP literature (cf. Arapostathis et al. (1993)), and perturbative approaches similar to the approach we take in our own proof are in general well-known (cf. Filar (2007)), to the best of our knowledge the desired result does not follow directly from any results appearing in the literature. As such, we include a proof for completeness in the technical appendix Section 5. We note that here the relevant analysis is considerably more challenging than that given in Xin and Goldberg (2015), due to the fact that in the dual-sourcing setting the inventory level is unbounded from below, and the associated ordering levels are not known to be uniformly bounded (in contrast to the setting considered in Xin and Goldberg (2015), for which such bounds were already proven in Zipkin (2008a)). Furthermore, although several bounds exist in the dual-sourcing literature relating order levels under an optimal policy to the inventory level at the time of ordering (cf. Sheopuri, Janakiraman and Seshadri (2010), Hua et al. (2014)), it seems that due to the inventory being unbounded below none of those bounds are suitable for our purposes. It is also worth noting that our approach is able to side-step many of the complexities and additional assumptions (e.g. finite second moment or bounded support) often required when analyzing inventory models which are unbounded from below. 

We defer all relevant proofs to the technical appendix Section 5. For two r.v.s $X,Y$, let $X \sim Y$ denote equivalence in distribution. Before stating our result, for the sake of building intuition, we first describe what the various r.v.s appearing in our result would correspond to “if we were to assume” (which we do not, i.e. it is not an assumption of our main results) that there exists an optimal policy which is stationary, and whose corresponding Markov chain converges to a steady-state distribution, i.e. the truncated regular pipeline and expedited inventory position converge in distribution under the operation of this optimal stationary policy. In that case, our theorem contains an $(L - L_0 - 1)$-dimensional random vector $\chi^{*,L}$, an $(L - L_0)$-dimensional random vector $q^{*,L}$, and a r.v. $\mathcal{I}^{*,L}$, which may be interpreted as follows. Suppose one has been operating under this stationary optimal policy for a long time, say up to some very large time $T$, at which time the system is essentially in steady-state (again we note that this discussion is purely for the sake of building intuition, and our main results do not actually assume this). Then $\chi^{*,L}$ corresponds to the steady-state truncated regular pipeline vector under this optimal policy (at time $T$), i.e. $\chi^{*,L}_i$ is the regular order which enters the expedited inventory position in period $T + i - 1$; $q^{*,L}$ corresponds to the steady-state vector of expedited orders to be placed over the next $L - L_0$ periods under this optimal policy, i.e. $q^{*,L}_i$ is the expedited order which enters the expedited inventory position in period $T + i - 1$. Finally, $\mathcal{I}^{*,L}$ corresponds to the steady-state expedited inventory position under this optimal policy (at time $T$). 


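Theorem 2. For all $L_0 \geq 0$ and $L \geq L_0 + 1$, one may construct an $(L - L_0 - 1)$-dimensional random vector $\chi^{*,L}$, an $(L - L_0)$-dimensional random vector $q^{*,L}$, and a random variable $\mathcal{I}^{*,L}$, as well as $\{D_i, i \geq 1\}$, on a common probability space s.t. the following are true. 

(i) W.p.1 $(\chi^{*,L}, q^{*,L})$ is non-negative. Also, $(\chi^{*,L}, \mathcal{I}^{*,L})$ is independent of $\{D_i, i \geq 1\}$, and $q^{*,L}_i$ is independent of $\{D_j, j \geq i\}$ for $i \in [1, L - L_0]$. 

(ii) $\chi^{*,L}_i \sim \chi^{*,L}_1$ for $i \in [1, L - L_0 - 1]$, and $q^{*,L}_i \sim q^{*,L}_1$ for $i \in [1, L - L_0]$. 

(iii) For all $k \in [1, L - L_0]$, 

$$\mathcal{I}^{*,L} + \sum_{i=1}^{k-1} \big(\chi^{*,L}_i + q^{*,L}_i - D_i\big) + q^{*,L}_k - \sum_{i=k}^{k+L_0} D_i \ \sim\ \mathcal{I}^{*,L} + q^{*,L}_1 - \sum_{i=1}^{L_0+1} D_i.$$

(iv) $(\chi^{*,L}, \mathcal{I}^{*,L})$ has finite mean. 

(v) $E[\chi^{*,L}_1] + E[q^{*,L}_1] = E[D]$. 

(vi) 

$$OPT(L) \geq c\big(E[D] - E[\chi^{*,L}_1]\big) + E\Big[G\Big(\mathcal{I}^{*,L} + q^{*,L}_1 - \sum_{i=1}^{L_0+1} D_i\Big)\Big].$$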

3.1.2. Vanishing discount factor approach. Although Theorem 2 relates OPT(L) to a certain expectation, this expectation (as written) is not immediately amenable to analysis. To remedy this, we introduce a discount factor $\alpha$ to implement the so-called “vanishing discount factor” approach to analyzing infinite-horizon MDP (cf. Huh, Janakiraman and Nagarajan (2011)), which will allow for a simpler analysis when we pass to the limit as $L \to \infty$. Indeed, this discount factor will help us to analyze the lower bound which arises when we apply the conditional Jensen’s inequality, as this lower bound will itself involve the solution to a non-trivial multi-stage dynamic optimization problem. We note that the lower bound which arose when related techniques were applied to single-sourcing systems with lost sales in Xin and Goldberg (2015) only involved a static optimization problem, and thus no such discount factor was introduced. In particular, Theorem 2 immediately implies the following corollary. Let $r_L = E[\chi^{*,L}_1]$. 

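Corollary 2. For all $L_0 \ge 0$, $L > L_0 + 1$, and $\alpha \in (0,1)$,

$$OPT(L) \ge c\left(E[D] - r_L\right) + \frac{1-\alpha}{1-\alpha^{L-L_0}} \sum_{k=1}^{L-L_0} \alpha^{k-1} E\left[ G\left( I^{*,L} + q_1^{*,L} - \sum_{i=1}^{L_0+1} D_i \right) \right]$$

$$\ge c\left(E[D] - r_L\right) + (1-\alpha) \sum_{k=1}^{L-L_0} \alpha^{k-1} E\left[ G\left( I^{*,L} + \sum_{i=1}^{k-1} \left( x_i^{*,L} + q_i^{*,L} - D_i \right) + q_k^{*,L} - \sum_{i=k}^{k+L_0} D_i \right) \right].$$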
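The passage from Theorem 2 to Corollary 2 can be read as a weighted-average argument. The sketch below records it under the assumption that $G \ge 0$ (in line with the non-negativity of costs used in the appendix). The shorthand $n = L - L_0$ and $a_k$ (the expectation of $G$ evaluated at the period-$k$ expression of Theorem 2(iii)) is introduced here for illustration and is not the paper's notation.

```latex
% Geometric weights, normalised to sum to one (n = L - L_0):
\frac{1-\alpha}{1-\alpha^{n}} \sum_{k=1}^{n} \alpha^{k-1} = 1 .

% By Theorem 2(iii) the period-k expressions are identically distributed,
% so a_1 = a_2 = ... = a_n and the weighted average of the a_k equals a_1.

% If a_k >= 0, then since 0 < 1 - \alpha^{n} <= 1, dropping the normalisation
% can only lower the sum:
\frac{1-\alpha}{1-\alpha^{n}} \sum_{k=1}^{n} \alpha^{k-1} a_k
  \;\ge\; (1-\alpha) \sum_{k=1}^{n} \alpha^{k-1} a_k .
```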
3.1.3. Applying the conditional Jensen’s inequality and relating to a single-source inventory model. We now apply the conditional Jensen’s inequality to Corollary 2, which will allow us to lower-bound OPT(L) by the optimal value of a certain finite-horizon single-source inventory model with backlogged demand. We will then relate this finite-horizon problem to an associated infinite-horizon problem, which has an optimal stationary policy. Furthermore, we will connect the behavior of such an optimal stationary policy to the performance of an associated TBS policy, ultimately allowing us to prove our main results. In particular, it follows from Theorem 2 and the independence structure of the relevant r.v.s that for $k \in [1, L - L_0]$,

$$E\left[ I^{*,L} + \sum_{i=1}^{k-1} \left( x_i^{*,L} + q_i^{*,L} - D_i \right) + q_k^{*,L} - \sum_{i=k}^{k+L_0} D_i \;\middle|\; D_{[k+L_0]} \right]$$

equals

$$E[I^{*,L}] + \sum_{i=1}^{k-1} \left( r_L + E\left[q_i^{*,L} \mid D_{[i-1]}\right] - D_i \right) + E\left[q_k^{*,L} \mid D_{[k-1]}\right] - \sum_{i=k}^{k+L_0} D_i .$$

Further combining with Corollary 2, the convexity of $G$, and Jensen’s inequality for conditional expectations, we obtain the following result.


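Proposition 1. For any $\alpha \in (0,1)$ and $L > L_0 + 1$, $OPT(L) - c\left(E[D] - r_L\right)$ is at least

$$(1-\alpha) \sum_{k=1}^{L-L_0} \alpha^{k-1} E\left[ G\left( E[I^{*,L}] - (L_0+1) r_L + \sum_{i=1}^{k-1} \left( E\left[q_i^{*,L} \mid D_{[i-1]}\right] - (D_i - r_L) \right) + E\left[q_k^{*,L} \mid D_{[k-1]}\right] - \sum_{i=k}^{k+L_0} (D_i - r_L) \right) \right]. \qquad (4)$$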
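The conditional Jensen step can be illustrated on a toy example. The sketch below is not the paper's model: it assumes a piecewise-linear cost $G(z) = h\max(z,0) + b\max(-z,0)$, a hypothetical randomized order $q = D_1 + \text{noise}$, and small made-up distributions. It compares $E[G(q - D_2)]$ with $E[G(E[q \mid D_1] - D_2)]$ by exact enumeration.

```python
from fractions import Fraction
from itertools import product


def jensen_sides(h=Fraction(1), b=Fraction(3)):
    """Return (E[G(q - D2)], E[G(E[q | D1] - D2)]) for a toy randomized order q = D1 + noise."""

    def G(z):
        # assumed convex holding/backorder cost
        return h * max(z, 0) + b * max(-z, 0)

    demand = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
    noise = {0: Fraction(1, 2), 2: Fraction(1, 2)}  # independent of demand

    unconditioned = conditioned = Fraction(0)
    for (d1, p1), (d2, p2) in product(demand.items(), repeat=2):
        # E[q | D1 = d1]: average the order over the independent noise
        mean_order = sum(pe * (d1 + e) for e, pe in noise.items())
        conditioned += p1 * p2 * G(mean_order - d2)
        unconditioned += p1 * p2 * sum(pe * G(d1 + e - d2) for e, pe in noise.items())
    return unconditioned, conditioned


full_cost, averaged_cost = jensen_sides()
# Convexity of G means averaging out the extra randomness cannot raise the cost.
assert full_cost >= averaged_cost
```

Replacing the order by its conditional mean given past demand therefore yields a lower bound, which is the mechanism behind Proposition 1.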


Note that (4) is the discounted cost incurred (during periods $L_0 + 1, \ldots, L$) by the policy ordering $E[q_i^{*,L} \mid D_{[i-1]}]$ in period $i$, of a single-sourcing $L$-period backlog inventory problem with unit holding cost $h$, backorder cost $b$, zero ordering cost, discount factor $\alpha$, i.i.d. demand distributed as $D - r_L$ (which we note can be positive or negative), lead time $L_0$, and initial inventory position (initial net inventory plus all entries of the initial pipeline vector) $E[I^{*,L}] - (L_0 + 1) r_L$ (cf. Karlin and Scarf (1958)), multiplied by $(1 - \alpha)$. Such models, and their optimal policies, have been studied in depth in the literature (cf. Karlin and Scarf (1958), Zipkin (2000), Fleischmann and Kuik (2003)), and are well-understood (especially for the case of non-negative demand, cf. Zipkin (2000)). Let $\Pi$ denote the family of all feasible non-anticipative policies for the aforementioned inventory problem (as it is typically defined, cf. Zipkin (2000)). For $\pi \in \Pi$, initial inventory position $x \in \mathbb{R}$, $r \in \mathbb{R}$, and $i \ge 1$, let $C_i^{\pi}(r,x)$ denote the cost incurred by policy $\pi$ in the aforementioned inventory problem in period $i + L_0$, if the demand in each period is i.i.d. distributed as $D - r$ (with the lead time $L_0$ and costs $b, h$ as above). For $x \in \mathbb{R}$, $r \in \mathbb{R}$, $\alpha \in (0,1)$, $n \ge 1$, let us define

$$V_\alpha^n(r,x) = \inf_{\pi \in \Pi} E\left[ \sum_{i=1}^{n} \alpha^{i-1} C_i^{\pi}(r,x) \right] \qquad (5)$$

and

$$V_\alpha^\infty(r,x) = \inf_{\pi \in \Pi} E\left[ \sum_{i=1}^{\infty} \alpha^{i-1} C_i^{\pi}(r,x) \right]. \qquad (6)$$

As a notational convenience, we define $V_\alpha^0(r,x) = 0$, $V_\alpha^n(r,-\infty) = \inf_{x \in \mathbb{R}} V_\alpha^n(r,x)$, $V_\alpha^\infty(r,-\infty) = \inf_{x \in \mathbb{R}} V_\alpha^\infty(r,x)$. Then combining the above, we derive the following lower bound for OPT(L).

Lemma 2. For all $L_0 \ge 0$, $L > L_0 + 1$, and $\alpha \in (0,1)$,

$$OPT(L) \ge c\left(E[D] - r_L\right) + (1 - \alpha) V_\alpha^{L-L_0}(r_L, -\infty). \qquad (7)$$


3.1.4. Overview of remainder of the proof of our main results. The remainder of the proof involves a careful analysis of the right-hand-side (r.h.s.) of (7) as $L \to \infty$, and we now sketch an outline of our approach. First, we will prove that if $r_L$ is bounded away from $E[D]$, then $V_\alpha^\infty(r_L, -\infty) - V_\alpha^{L-L_0}(r_L, -\infty)$ can be suitably bounded by a function of $L$ which converges to 0 as $L \to \infty$. We then observe that the infinite-horizon problem associated with $V_\alpha^\infty(r_L, -\infty)$ has an optimal policy which is stationary, Markov, and of order-up-to type. Furthermore, the stochastic process induced by this optimal policy will be equivalent to that induced by a corresponding TBS policy, but possibly initialized not according to the stationary distribution of the associated inventory process. Then we prove that $r_L$ is indeed bounded away from $E[D]$, since otherwise we can use the theory of random walks to derive a contradiction (as OPT(L) would be strictly greater than U). Finally, we combine these facts to bound various error terms under a suitable choice of $\alpha$ (which converges to 1 as $L \to \infty$), including a term resulting from the difference between the performance of the same TBS policy under different initializations, to prove our main result Theorem 1.
3.2. Proof of Theorem 1

We now complete the proof of Theorem 1 by formalizing the argument sketched at the end of Section 3.1. Such arguments are standard in the literature on MDP and infinite-horizon inventory control problems (cf. Iglehart (1963), Sennott (1989), Schäl (1993), Feinberg (2011), Huh, Janakiraman and Nagarajan (2011)). We note that the somewhat non-standard aspect here is that the demand in each period is distributed as $D - r_L$, and thus may be negative. As such, the original arguments typically used to analyze the relevant quantities and prove related interchange-of-limits results (cf. Iglehart (1963)) do not directly apply. The possibility of negative demand also makes the verification of the conditions of general theorems which validate such bounds and interchange-of-limits (cf. Sennott (1989), Schäl (1993)) somewhat involved, even when these theorems are customized to the inventory setting (cf. Parker and Kapuscinski (2004), Huh, Janakiraman and Nagarajan (2011)). We note that the verification of closely related results has arisen recently in the context of analyzing inventory systems with returns, which reduce to standard inventory systems where demand can be positive or negative (cf. Fleischmann and Kuik (2003)). However, those results (which verify the technical conditions of Sennott (1989)) do not seem to extend immediately to our case, and further seem to require that the demand and ordering quantities take integer values. In light of the above, and for the sake of clarity and completeness, we now provide a self-contained proof of all necessary bounds, which (combined with Lemma 2) will complete the proof of our main result Theorem 1. We defer most proofs to the technical appendix Section 5.


We begin by stating some well-known properties of $V_\alpha^n(r,x)$ and $V_\alpha^\infty(r,x)$, which follow from the results of JSS, Karlin and Scarf (1958) and Scarf (1960). We note that although in some cases the proofs there are only explicitly given for the case of non-negative demand, as noted in Heyman and Sobel (1984) and Fleischmann and Kuik (2003), the arguments carry over to the general case (in which demand may be negative) with only trivial modification.
Lemma 3 (JSS, Scarf (1960)). For all $\alpha \in (0,1)$, $r, x \in \mathbb{R}$, and $n \ge 1$,

$$V_\alpha^n(r,x) = \inf_{y \ge x} \left( E\left[ G\left( y - \sum_{k=n}^{L_0+n} (D_k - r) \right) \right] + \alpha E\left[ V_\alpha^{n-1}\left( r, y - (D_{L_0+n} - r) \right) \right] \right).$$

Furthermore, $V_\alpha^n(r,x)$ is: a convex (and thus also continuous) function of $x$ on $\mathbb{R}$ for each fixed $n, r$; a continuous function of $r$ on $\mathbb{R}$ for each fixed $n, x$; an increasing function of $x$ on $\mathbb{R}$ for each fixed $n, r$; and an increasing function of $n$ for each fixed $x, r$. In addition, the infinite-horizon problem stated in the r.h.s. of (6) admits an optimal stationary Markov policy.
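To make the recursion of Lemma 3 concrete, here is a minimal value-iteration sketch on an integer grid. It is an illustration under simplifying assumptions that are not taken from the paper: zero expedited lead time, integer-valued demand and integer $r$, the cost $G(z) = h\max(z,0) + b\max(-z,0)$, hypothetical parameter values, and a truncated state space (states are clipped to the grid).

```python
def finite_horizon_values(n_periods, alpha, r, demand_pmf, h=1.0, b=4.0, half_width=30):
    """Return [V^0, ..., V^n]; each V^m is a list indexed by x = -half_width, ..., half_width.

    Assumes zero lead time, integer r and integer demands, and
    G(z) = h*max(z, 0) + b*max(-z, 0).
    """
    grid = range(-half_width, half_width + 1)

    def G(z):
        return h * max(z, 0) + b * max(-z, 0)

    def index(z):
        # clip to the grid (a truncation used only for this illustration)
        return min(max(z, -half_width), half_width) + half_width

    values = [[0.0] * len(grid)]  # V^0 = 0
    for _ in range(n_periods):
        prev = values[-1]
        # cost of ordering up to y: expected one-period cost plus discounted cost-to-go
        order_up_to_cost = [
            sum(p * (G(y - (d - r)) + alpha * prev[index(y - (d - r))])
                for d, p in demand_pmf.items())
            for y in grid
        ]
        # V^m(x) = min over y >= x, i.e. a running minimum taken from the right
        current, best = [], float("inf")
        for cost in reversed(order_up_to_cost):
            best = min(best, cost)
            current.append(best)
        values.append(current[::-1])
    return values


values = finite_horizon_values(n_periods=5, alpha=0.9, r=1,
                               demand_pmf={0: 0.25, 1: 0.5, 2: 0.25})
```

On this toy instance the computed values are non-decreasing in $x$ and in $n$, matching two of the monotonicity properties stated in the lemma; convexity near the grid boundary may be distorted by the truncation.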

Next, we bound $V_\alpha^\infty(r,x) - V_\alpha^n(r,x)$, and combine our bounds with Lemma 3 to derive some useful properties of $V_\alpha^\infty(r,x)$ and the associated optimization problem. We defer all proofs to the
technical appendix Section [5l Let 5'„(r) = 4(Lo + + E[D])(1 — 

Lemma 4. For a € (0, 1), r, x G R, and n > 1, 

0 < Vffi{r, x) - l/”(r, x) < max(6, h) {Sa{r) + |x| + |r| + E[L>]) (1 + Lq + n)(l - (8) 

and $V_\alpha^\infty(r,x) = \lim_{n \to \infty} V_\alpha^n(r,x)$. Furthermore, for $\alpha \in (0,1)$ and $r \in \mathbb{R}$, $V_\alpha^\infty(r,x)$ is a finite-valued, convex, and non-decreasing function of $x$ on $\mathbb{R}$. Letting $S_\alpha^\infty(r)$ denote the supremum of the set of minimizers (in $x$) of $V_\alpha^\infty(r,x)$, it holds that $|S_\alpha^\infty(r)| \le S_\alpha(r)$, and the infinite-horizon problem stated in the r.h.s. of (6) admits an optimal stationary base-stock policy, with order-up-to level $S_\alpha^\infty(r)$. In addition, for $L_0 \ge 0$, $L > L_0 + 1$, and $\alpha \in (0,1)$,

OPT{L) > c(E[D] - rA + (1 - a)4C(^L, (^l)) - Uo{l - a)-^La^-^o. (9) 


We now formally define the Markov process representing the inventory position process under such an optimal stationary base-stock policy, initialized in state $S_\alpha^\infty(r_L)$. Let $S_{\alpha,L} = S_\alpha^\infty(r_L)$. For $r \in [0, E[D]]$ and $y \in \mathbb{R}$, let $\{X_k^{r,y}, k \ge 1\}$ denote the following Markov process. $X_1^{r,y}$ equals $y$. For all $k \ge 1$, $X_{k+1}^{r,y} = \max\left(X_k^{r,y} + r - D_k, y\right)$. Let $W_k^r = \sum_{j=1}^{k} (r - D_j)$, $Z_k^r = \max_{j \in [0,k-1]} W_j^r$, $Z_\infty^r = \sup_{j \ge 0} W_j^r$, $M_k^r = E[Z_k^r]$, $M_\infty^r = E[Z_\infty^r]$. It follows from the well-known analysis of the single-server queue using Lindley’s recursion (cf. Asmussen (2003)) that $X_k^{r,y} \sim y + Z_k^r$, and $X_\infty^{r,y} = \lim_{k \to \infty} X_k^{r,y}$ (in the sense of weak convergence) is a well-defined r.v. distributed as $y + Z_\infty^r$.
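The distributional identity above rests on a pathwise fact about Lindley's recursion: the excess of the inventory position over $y$ after $k$ steps equals the running maximum of the random walk driven by the first $k$ demands taken in reverse order. Since i.i.d. demands have the same law when reversed, the stated equality in distribution follows. The sketch below checks the pathwise fact on one sample path; the function names and the integer test data are illustrative choices, not the paper's.

```python
import random


def inventory_path(y, r, demands):
    """X_1 = y and X_{k+1} = max(X_k + r - D_k, y); returns [X_1, ..., X_{n+1}]."""
    path = [y]
    for d in demands:
        path.append(max(path[-1] + r - d, y))
    return path


def max_partial_sum(r, demands):
    """max_{0 <= j <= n} W_j, where W_j = sum_{i <= j} (r - D_i) and W_0 = 0."""
    best = running = 0
    for d in demands:
        running += r - d
        best = max(best, running)
    return best


rng = random.Random(0)
y, r = 5, 1
demands = [rng.choice([0, 1, 2, 3]) for _ in range(50)]
path = inventory_path(y, r, demands)

# Pathwise identity: X_{k+1} - y equals the running maximum of the walk
# built from the first k demands in reverse order.
for k in range(len(demands) + 1):
    assert path[k] == y + max_partial_sum(r, demands[:k][::-1])
```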


Combining these definitions with Lemmas 3 and 4, we conclude the following.

Corollary 3. For $L_0 \ge 0$, $L > L_0 + 1$, and $\alpha \in (0,1)$,


Lq + I 


OPT{L) > c{E[D] - rA + (1 - «) (A' - ^l))] “ Uoil - a)-^La^-^°. 


fc=i 


We now briefly review some useful properties of $Z_k^r$, which we will use to complete the proof of our main results. These properties follow by combining generally well-known results for generating functions, large deviations, single-server queues, and recurrent random walks (cf. Kingman (1962), Folland (1999), Asmussen (2003), Spitzer (1956), Xin and Goldberg (2015)), and we omit the details.

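Lemma 5. For all $r \ge 0$, $\{M_k^r, k \ge 1\}$ is non-decreasing, $M_\infty^r = \lim_{k \to \infty} M_k^r$, and for all $i > j \ge 1$, $M_i^r - M_j^r = \sum_{k=j}^{i-1} k^{-1} E[\max(0, W_k^r)]$. If there exists $\epsilon \in (0, E[D])$ s.t. $r < E[D] - \epsilon$, then $M_\infty^r < \infty$, and $M_\infty^r - M_n^r \le \left(\vartheta_\epsilon (1 - \gamma_\epsilon)\right)^{-1} \gamma_\epsilon^{n}$ for all $n \ge 1$.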
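The increment formula for $M_i^r - M_j^r$ is an instance of Spitzer's classical identity, $E[\max_{0 \le j \le n} W_j] = \sum_{k=1}^{n} k^{-1} E[\max(0, W_k)]$. As a sanity check, the sketch below verifies that identity exactly, by enumerating all paths of a short random walk with rational step probabilities. The step distribution (corresponding to $r = 1/2$ and a hypothetical three-point demand) is made up for illustration.

```python
from fractions import Fraction
from itertools import product


def spitzer_check(steps, n):
    """steps maps each increment to its probability.

    Returns (E[max_{0<=j<=n} W_j], sum_{k=1}^n E[max(0, W_k)] / k), computed exactly.
    """
    expected_max = Fraction(0)
    expected_pos = [Fraction(0)] * (n + 1)  # expected_pos[k] accumulates E[max(0, W_k)]
    for seq in product(steps, repeat=n):
        prob = Fraction(1)
        for s in seq:
            prob *= steps[s]
        walk, best = Fraction(0), Fraction(0)
        for k, s in enumerate(seq, start=1):
            walk += s
            best = max(best, walk)
            expected_pos[k] += prob * max(Fraction(0), walk)
        expected_max += prob * best
    rhs = sum(expected_pos[k] / k for k in range(1, n + 1))
    return expected_max, rhs


# Increments r - D with r = 1/2 and D in {0, 1, 2} (illustrative values only).
steps = {Fraction(1, 2): Fraction(1, 4),
         Fraction(-1, 2): Fraction(1, 2),
         Fraction(-3, 2): Fraction(1, 4)}
lhs, rhs = spitzer_check(steps, 6)
assert lhs == rhs
```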

Finally, we will also need the following corollary (of Lemma 5), which shows that $r_L$ is uniformly bounded away from $E[D]$ in an appropriate sense, and whose proof we again defer to the technical appendix Section 5.

Corollary 4. For all L > + Lq 4-1, it holds that r^ < E[F)] — cq. 

We now complete the proof of our main results. 

Proof of Theorem 1. It follows from (3) that for all $\alpha \in (0,1)$,


^(^r^,S^.L + (Lo + l)rA = c{E[D]-rL)+E 


Lq + I 


G[S^,L + Z:f + {Lo + l)rL-J2D: 


c{E[D] - rA + (1 - a) ^ A’^E 




2 = 1 / 
^0 + 1 


G[S^,L + Zl^-^{D:-rA 


2=1 


Combining with Corollaries O and [H Lemma El (HD, and the fact that L > Cq ^ + Lo + 1 , we conclude 
that for all a G (0,1), C'( 7 rr*,s*) — OPT(F) — Uo(l — is at most 


(l-a)^A 1 


E 


fc = l 


/ ^0 + 1 

Gis^,i^ + Z:f-Y,W-rL) 
V i=l / 


-E 


/ Lo+l ' 

V i=l / 


< 


max( 6 ,/i)(l-a)^A ^ 7 ,^^ 


fe=l 


= max(6,/i)7,Q(i9,g(l-7,J) “ 


7.0 « 


< (l-a)max(6,/i)7,^d^^ (1-7,J . 
We conclude that for all a€ (|, 1), C{TTr*,s*) — OPT(L) is at most
Uo2^°{l - a)~^La^ + (1 - a) max(6, 

As L> €q^ + Lq + 1 implies L > 100, which itself may be shown to imply that slsMhi ^ may 
set a = 1 — ^ Then applying the fact that 1 — a < exp(—a), we conclude that 

C{TTr-,s *) - OPT(L) < {Uo2^° + max(6, - 7eo)”^) • 

As for all L > 1, combining with the fact that OPT > g and a straightforward calcu¬ 

lation completes the proof. □. 

4. Conclusion 

In this paper, we proved that when the lead time of the express source is held fixed, a simple TBS 
policy is asymptotically optimal for the dual-sourcing inventory problem as the lead time of the 
regular source grows large. Our results provide a solid theoretical foundation for several conjectures 
and numerical experiments appearing previously in the literature regarding the good empirical 
performance of such policies. Furthermore, the simple TBS policy performs nearly optimally exactly 
when standard DP-based methodologies become intractable due to the curse of dimensionality. In 
addition, since the “best” TBS policy can be computed by solving a convex program that does not 
depend on the lead time of the regular source, and is easy to implement, our results lead directly to 
very efficient algorithms with asymptotically optimal performance guarantees. We also explicitly 
bound the optimality gap of the TBS policy for any fixed lead time (of the regular source), and 
prove that this decays inverse-polynomially in the lead time of the regular source. Perhaps most 
importantly, since many companies are already implementing such TBS policies, our results provide 
strong theoretical support for the widespread use of TBS policies in practice. 

This work leaves many interesting directions for future research. First, it would be interesting to further investigate the rate of convergence to optimality of TBS policies as the lead time grows large, especially in light of their use in practical settings. Although we have not optimized the explicit bounds which we have proven on the optimality gap, we suspect that proving significantly stronger (e.g. exponentially decaying) bounds will require the development of new techniques. For example, when we apply the conditional Jensen’s inequality to lower bound the optimal value by a certain single-sourcing problem in Section 3.1.3, our current approach does not incorporate the fact that $E[q_i^{*,L}]$ is the same for all $i$, instead only using the fact that $E[q_i^{*,L} \mid D_{[i-1]}]$ is a measurable function of $D_{[i-1]}$. It seems plausible that incorporating this “stationary expectations” property may be a promising approach here. Previous bounds from the literature on the rate of convergence of finite horizon inventory optimization problems to their infinite horizon counterparts, e.g. Hordijk and Tijms (1974, 1975), may also be helpful.


Second, and related to the aforementioned discussion as regards the rate of convergence to optimality of TBS policies, it would be interesting to identify other more sophisticated algorithms which perform better for small-to-moderate lead times, yet remain efficient to implement. Indeed, it remains an interesting open question to better understand the trade-off between algorithmic run-time and achievable performance guarantees in this context, i.e. how complex an algorithm is required to “exploit” the weak correlations which persist even as the lead time grows large. In the context of dual-sourcing, potential algorithms here include: the so-called dual-sourcing smoothing policies recently studied in Boute and Van Mieghem (2015); affine policies more generally (cf. Bertsimas, Iancu and Parrilo (2010)), of which dual-sourcing smoothing policies are a special case; the single index and dual index policies discussed earlier; or the dual-balancing policies analyzed in Levi, Janakiraman and Nagarajan (2008). It would also be quite interesting to analyze “hybrid” algorithms, which could e.g. solve a large dynamic program when the lead time is small, and gradually transition to using simpler heuristics as the lead time grows large; or combine different heuristics depending on the specific problem parameters. In the context of the above conversation on optimality gaps, we do remind the reader that for any fixed regular lead time a TBS policy is not exactly optimal except in some very special cases (cf. JSS), and that our results (and associated insights) should always be applied with care.
On a final note, combined with the results of Goldberg et al. (2015) and Xin and Goldberg (2015), our methodology lays the foundations for a completely new approach to analyzing inventory models with large lead times. So far, this approach has been successful in yielding key insights and efficient algorithms for two settings previously believed intractable: lost sales models with large lead times, and dual-sourcing models with large lead time gap. We believe that our techniques have the potential to make similar progress on many other difficult supply chain optimization problems of practical relevance in which there is a lag between when policy decisions are made and when those decisions are implemented. This includes both more realistic variants of the lost-sales and dual-sourcing models considered so far (e.g. models with distributional dependencies, parameter uncertainty, complex network structure, and more accurate modeling of costs), as well as fundamentally different models (e.g. inventory systems with remanufacturing when the manufactured and remanufactured lead times differ, cf. Zhou, Tao and Chao (2011); multi-echelon systems with lost sales and positive lead times, cf. Huh and Janakiraman (2010); or models with perishable goods). In closing, we note that our approach can more generally be viewed as a methodology to formalize the notion that when there is a high level of uncertainty and randomness in one’s supply chain, even simple policies perform nearly as well as very sophisticated policies, since no algorithm can “beat the noise”. Exploring this concept from a broader perspective may be fruitful in yielding novel algorithms and insights for a multitude of problems in operations management and operations research.
5. Technical Appendix

5.1. Proof of Theorem 2

5.1.1. Overview of proof. Before providing a formal proof, we first provide an intuitive overview, noting that our proof is similar to several proofs in the literature (cf. Xin and Goldberg (2015) and the references therein). We proceed by constructing a sequence of random vectors, one for each sufficiently small $\epsilon > 0$, and later take an appropriate weak limit (which will become the vector satisfying the conditions of the theorem). As in Xin and Goldberg (2015), given $\epsilon > 0$, we will pick a sufficiently large time $T_\epsilon$ s.t. the expected performance of an approximately optimal (possibly non-stationary) policy up to time $T_\epsilon$ is “close” to OPT(L). We then further prove the existence of a time $T_{1,\epsilon}$ “near” $T_\epsilon$ s.t. the expedited inventory position and truncated regular pipeline vector (under policy $\pi^{*,\epsilon}$) are “well-behaved” at time $T_{1,\epsilon}$, which will be necessary for our later arguments, as it will allow us to bound the time needed to “clear the system” if one orders nothing from that time onwards. We then construct a “modified policy” and associated Markov chain, which behaves exactly like the expedited inventory position and truncated regular pipeline vector under $\pi^{*,\epsilon}$ on $[1, T_{1,\epsilon}]$, but after that time forces a sequence of ordering decisions which cause the associated inventory position and pipeline vector to re-enter a state distributed
as its initial state, at which time the entire process restarts. We note that due to the process 
being unbounded from below, here the special initialization involving — ^ will prove useful. 

This regenerative structure, combined with our careful selection of $T_{1,\epsilon}$, will allow us to apply the theory of regenerative processes to prove the existence of a stationary distribution for the associated Markov chain, which we will prove to satisfy the conditions of an “approximate” version of Theorem 2 (with the approximation error parametrized by $\epsilon$). Taking a weak limit (as $\epsilon \downarrow 0$) of the associated sequence of random vectors yields a random vector satisfying the conditions of Theorem 2, completing the proof.

As all results in this subsection will be stated for a fixed $L_0 \ge 0$ and $L > L_0 + 1$, we assume these parameters are fixed and suppress any associated notational dependencies. For $\epsilon > 0$, let
denote some (fixed) policy in ft s.t. OPT(L) > Let ^ + 2E[D]), and 

U 2 ,e = (L + + 2 ) (h + 6 + c)[/e. It follows from the definition of limsup that there exists Tg > 

100 (C/ 2 ,g + (C/ + l)L)e-i s.t. C{Tr*’^)>T-^Y.LLo+i^[^f’l “I for all T > (1 - e)Tg - L. 

5.1.2. Existence of time $T_{1,\epsilon}$, near $T_\epsilon$, at which inventory and pipeline are small in expectation. We first prove that there must exist a time “close to” $T_\epsilon$ at which the expedited inventory position and truncated regular pipeline vector (under policy $\pi^{*,\epsilon}$) are “small” (in absolute value) in expectation.


Claim 1. For all e€ there exists Ti^^€[{l — e)T^ —L,T^] s.t. 

C{7r*n>Trf ^ 

i—L q +1 

for all A: G [0, L — 1], 


c: 


e 

2 ’ 


+C+Ll.I1 < + (io + 

and for all k € [1, L — Lq — 1], 


2UL 


( 10 ) 


( 11 ) 


jg[^-‘’bTi,g ^°]< +2LE[T)]. (12) 

Proof of Claim[l\ Note that we may (deterministically) partition the time interval [(1 — e)Tg — 
L,Tg] into [^1 disjoint intervals each of length L, plus an additional disjoint time interval of 
length possibly less than L. Suppose for contradiction that of these disjoint time intervals of 
length L, there does not exist a single such interval I s.t. 


mtL + + (^0 + 1)E[T] for all t € /. (13) 

t XlLlll 1^5 ' ^ ) 

In that case, by the triangle inequality, each of these \^'\ intervals contains at least one time 
period t for which 


nif-L+<lVrf- 


t 

E 

i—t — LQ 


All> 


2UL 


emin(6, h) 
Hence by ([T]), non-negativity of costs, and the definition of we conclude that

2 

3 


C(7r*-^) > 


> 


min(6,/i) X ^ X 


e 

2 


= 2U- 


> 




and thus OPT(L) > — ^ > U, a contradiction. Let t' denote the left end-point of the corre¬ 

sponding interval satisfying fll3p . whose existence we have just proven by contradiction (in case of 
multiple such intervals, take the left-most such interval). Now, further suppose for contradiction 
that there exists A: G [1, L —Lq — 1] s.t. E[7^^ ’ -|-2LE[Z)]. Then it would follow from 

the inventory update dynamics, non-negativity of order quantities, and the triangle inequality that 
^[\it*+k-LQ + ^t'+k-LoW > emin(b fe) which would itself contradict the definition of t'. 

Combining the above, and setting ^ = t', completes the proof. □. 


5.1.3. Statement of approximate form of Theorem 2. We now formally state the aforementioned approximate version of Theorem 2.


Lemma 6. For all e G (0,min (hu)), one may construct an L — Lq — 1-dimensional random 
vector an L — Lq- dimensional random vector and a random variable as well as 
{A, i > 1}; on a common probability space s.t. the following are true. 

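(i) $(x^{*,\epsilon}, I^{*,\epsilon})$ has finite mean, and w.p.1 $(x^{*,\epsilon}, q^{*,\epsilon})$ is non-negative. Also, $(x^{*,\epsilon}, I^{*,\epsilon})$ is independent of $\{D_i, i \ge 1\}$, and $q_i^{*,\epsilon}$ is independent of $\{D_j, j \ge i\}$ for $i \in [1, L - L_0]$.

(ii) $x_i^{*,\epsilon} \sim x_1^{*,\epsilon}$ for $i \in [1, L - L_0 - 1]$, and $q_i^{*,\epsilon} \sim q_1^{*,\epsilon}$ for $i \in [1, L - L_0]$.

(iii) $E[x_1^{*,\epsilon}] + E[q_1^{*,\epsilon}] = E[D]$.

(iv) For all $k \in [1, L - L_0]$,

$$I^{*,\epsilon} + \sum_{i=1}^{k-1} \left( x_i^{*,\epsilon} + q_i^{*,\epsilon} - D_i \right) + q_k^{*,\epsilon} - \sum_{i=k}^{k+L_0} D_i \;\sim\; I^{*,\epsilon} + q_1^{*,\epsilon} - \sum_{i=1}^{L_0+1} D_i .$$

(v)

$$OPT(L) \ge c E[q_1^{*,\epsilon}] + E\left[ G\left( I^{*,\epsilon} + q_1^{*,\epsilon} - \sum_{i=1}^{L_0+1} D_i \right) \right] - \epsilon .$$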
5.1.4. Proof of Lemma 6 by construction of a Markov chain with an appropriate stationary distribution. We now construct an appropriate Markov chain which repeatedly mimics $\pi^{*,\epsilon}$ for blocks of time of length $T_{1,\epsilon}$, and then (by a sequence of ordering decisions) brings the
system back to a state distributed as its initial state (involving — ^-i)- This is accomplished 

by allowing for an extra “time-accounting” dimension in the state-space. While this “clock” is between 1 and $T_{1,\epsilon}$, the Markov chain dynamics parallel those of the inventory and pipeline in $\pi^{*,\epsilon}$ on $[1, T_{1,\epsilon}]$, and the clock increases by one in each period. Whenever the clock reaches $T_{1,\epsilon}$, the Markov chain dynamics instead parallel those of a policy which first orders nothing until the truncated regular pipeline vector clears and the inventory position goes below 0, then places an expedited order to bring the inventory position to exactly 0, and finally orders nothing for an additional geometrically distributed number of time periods, where this geometric idling will preclude any pathological periodicity that might otherwise arise (ensuring existence of relevant stationary measures). This brings the system back to a state in which the truncated regular pipeline vector
is empty and the inventory position is distributed as — X]i=i at which time the clock restarts 
to 1 and the cycle repeats, which thus yields a regenerative process. We further note that in the associated Markov chain we will also keep track of the most recent expedited order, so that all relevant inventory and ordering costs can be expressed directly as a function of the state in the associated Markov chain. This will allow us to apply the theory of regenerative processes to prove that the expected value of an appropriate function of the corresponding steady-state vector bounds the average cost incurred by $\pi^{*,\epsilon}$ on $[1, T_{1,\epsilon}]$, which itself well-approximates OPT(L) (i.e. ensuring that Lemma 6.(v) is satisfied). Combining the above will allow us to prove that this steady-state vector satisfies the conditions of Lemma 6.

Proof of Lemma\^ We construct an (L — Lq-|- 2)-dimensional discrete-time Markov process 
> 1} = QjjXj, Tj), t > 1}, where is an {L — Lq — l)-dimensional random vector, and 

QtAtJ Tj are random variables. Let {Bt,t > 1} denote an i.i.d. sequence of Bernoulli r.v.s, each 
of which equals 1 w.p. 1 and 0 w.p. |. Then {Y(,t > 1} evolves as follows. 
= 0 = = , rr = l.

For t > 1, the dynamics are as follows. 

• = Xl’+i for i e [1,L - Lo - 2] , = XI + Xi' + qt - A- 

— Tt\^=T^ + l , x'L-fo-i = fC}ix"’\It) > 9|+1 = /K4+l(x‘'’*+^A+l)• 

• If r/ = Ti^g and either x’^’^ 7 ^ 0 or > 0: 

— X^L-fo-l =qt+l=0 , T^+i =T,e- 

• If = Ti g and = 0 and < 0: 


-xnVi = o ’ < 11+1 => ^A = o. 

• If = 0: 

-xnVi = o ’ = 0 = 

One may easily verify the following properties of {Y^, t > 1}. Let z(x, y) = E[G(x + y- A)] + 

cy, and X^ denote a r.v. distributed as the time between the chain’s initial and second visit to a 


state s.t. Tj = 1. 

• It follows directly from the Markov chain dynamics that for all t > 1, xl’ ~ x'+i for i e 
[1,X-1]. 

• Conditional on the event {t^ = T A, the expected number of time steps until A = 0 is at most 
T 4- 

^ ■ 

• Conditional on the event {r/ = 0, it holds that (w.p.l) = 0) A+i = ~A) and 

the number of time steps until = 1 is distributed as G. 

_ 

• Conditional on the event {r/ = 1}, it holds that Y* ~ ( 0 , 0 ,and the joint 
distribution of {Y'^,i G [t,f+ Xe — 1 ]} is identical to that of {{TZ'^ ’"'^^qi ’ A A [l,Xi^j]}. 

• W.p.l Xj X Xl £, and E[X£] — Xi,, < X + + 2. 

• 0 < EEii AA, <it)] - E[EfA AA, <it)] < A,.. 

Combining with the basic definitions associated with the theory of regenerative processes (here we refer the interested reader to Asmussen (2003) for an excellent overview), we conclude that $\{Y_t^\epsilon, t \ge 1\}$ is a discrete-time aperiodic regenerative process, with regeneration points coinciding with visits to states s.t. $\tau_t^\epsilon = 1$. Then we may conclude the following from standard results in the theory of regenerative processes (cf. Asmussen (2003), Thorisson (1992)).


(a) {Yt,t> 1} converges weakly (as t —)■ oo) to a limiting random vector 

(b) Initializing the relevant Markov chain with initial conditions distributed as Y^ yields a sta¬ 
tionary Markov process {Yj,t > 1} = r^,t > 1}. Furthermore, it follows directly 

from the relevant Markov chain dynamics that we may construct {Y(,t >1} and {Zli,z > 1} on 
an appropriate probability space s.t. setting =X^, and = q\ for/c G [1,L —Lq] 

yields a random vector satsifying conditions (jl|) - (Ir^ of Lemma [H 


(c) = 

Further combining (jcj) with our previous bounds for E[Te],E[^^^ g^)], our definition of Ti_e, 

and some straightforward algebra (the details of which we omit) demonstrates that the same 
random vector exhibited in db]) also satishes condition of Lemma El completing the proof of the 
lemma. □. 


5.1.5. Proof of Theorem 2. We now complete the proof of Theorem 2 by taking an appropriate weak limit (as $\epsilon \downarrow 0$) of the random vectors which we have proven to satisfy the conditions of Lemma 6, and verifying certain interchanges of expectation and limit (in inequality form).

Proof of Theorem 0 To complete the proof of Theorem [21 we now prove that the sequence 
of random vectors {(x*’",q*’",T*’"),re > 2 -|- A} is tight. It follows from Lemma El (InH) and dvj) . 
non-negativity, the triangle inequality, the fact that OPT(L) < U, and ([H) that for all n>2 + ^, 


Combining with Lemma [6l (pl|) - (pEl) and non-negativit y, we cone 


(14) 


hence existence of at least one subsequential limit (cf. 


Billingsley 


ude th e desired tightness, and 


([1999IB (x*’°“,q*’°°,T*’°°). Let 


{ni,i >1} denote any fixed subsequence along which the sequence of measures converges to this 
limit, s.t. rii > 2 -|- A. That this weak limit satisfies Theorem 13 (0) - (pEl) follows from the dehni- 
tion of weak convergence. However, it will require somewhat subtle reasoning to prove (|r^ - (|^ . 
since e.g. dominated convergence does not necessarily hold and thus one must take care when
interchanging limit and expectation. Note that by the Skorohod representation theorem and con¬ 
tinuous mapping theorem, we may construct > 1} and < 1 *’°“! |X*’°°|) 

on a common probability space s.t. the corresponding sequence of random vectors converges 
almost surely (i.e. not only in distribution) to \^*'°°\)- As all associated r.v.s are non¬ 

negative, we may apply Fatou’s lemma to conclude that E[xt’°°] < hminfi_,,ooE[x*’^],E[g'*’°°] < 
* ^ 1 

liminfi_>oo ], and E[|X*’°°|] < liminfi_,,oo E[|X*’"i |]. Combining with Lemma [Hdlj) and dm]), 

as well as m, then completes the proof of Theorem [2J ()i^ . Combining with the already proven 
Theorem [21 (jn]) and (lui)) . with k = 2, yields Theorem [21 (j^ . Finally, we prove that the correspond¬ 
ing vector also satisfies Theorem [21 (|^ . Let = cql’" + G -|- ql’" — Z^o = 

-|- G + q*{°° — ■ The already proven weak convergence, and continuous map¬ 

ping theorem, implies that {Z.n.,z > 1 } conve rges weakly to Z^o- It follows from the Skorohod 


representation theorem (cf. 


BillingslevI f[l999l) i that we may construct {X„. ,z > 1} and Z^ on 


a common probability space so that this convergence holds almost surely (as opposed to only 
in distribution). Applying non-negativity and Fatou’s lemma, we conclude that on this proba¬ 
bility space, E[liminfi^oo < liminfi_>oo and hence (combining with the stated almost 

sure convergence) E[Zoo] < liminfi_>oo Combining with Lemma [H(|yj) , which implies that 

OPT(L) > E[Z„J — T for all z > 1, and the already proven Theorem [21 . completes the proof. 

□. 

5.2. Proof of Lemma 4

In preparation for bounding $V_\alpha^\infty(r,x) - V_\alpha^n(r,x)$, we first bound the optimal value, and set of minimizers, of $V_\alpha^n(r,x)$, uniformly in $n$. For $\alpha \in (0,1)$ and $r \in \mathbb{R}$, let $S_\alpha^n(r)$ denote the supremum of the set of minimizers (with respect to $x$) of $V_\alpha^n(r,x)$, where we note that a straightforward contradiction demonstrates that $S_\alpha^n(r) \in (-\infty, \infty)$ for each $\alpha, n, r$; and it follows from Lemma 3 that $V_\alpha^n(r, -\infty) = V_\alpha^n\left(r, S_\alpha^n(r)\right)$. Then we prove the following uniform bounds.

Lemma 7. 1. For $\alpha \in (0,1)$ and $r, x \in \mathbb{R}$, it holds that

$$\sup_{n \ge 1} V^n_\alpha(r,x) \le 2(L_0+1)\max(b,h)\big(|x| + |r| + \mathbb{E}[D]\big)(1-\alpha)^{-2}.$$

2. For $\alpha \in (0,1)$ and $n \ge 1$, it holds that $|S^n_\alpha(r)| \le S_\alpha(r)$.

3. For all y ■, Saif)] andn>l, 


E 


Lo+n 

G{y-Y,{D,-r)) 

k—n 


+ aE[V^ ^(r,y-(Z)ig+„-r))] > V;”(r, ^"(r)) + (Lo +1) max(6,/i)E[D]. 


4. For all L > Lq + 1, 


OPT{L) > c(E[D] - rA + (1 - a)Vf-^^ (r^, -5„(r J). 


Proof of Lemma^ By evaluating the policy which never orders, we conclude that for all a G 
(0,1), r,xGM, sup„>il/”(r,x) is at most E E*=i “ ^)) 

which by (1) is itself bounded by

$$\max(b,h)\big(|x| + |r| + \mathbb{E}[D]\big)\sum_{i=1}^{\infty}(i+L_0)\alpha^{i-1} \le 2(L_0+1)\max(b,h)\big(|x| + |r| + \mathbb{E}[D]\big)(1-\alpha)^{-2}.$$

The remainder of the lemma follows from (1), Lemmas 2 and 3, and a straightforward calculation
and argument by contradiction, and we omit the details. Combining the above completes the proof.
□
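The geometric-series step behind the bound in part 1 of Lemma 7 can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: it takes the weights to be $(i+L_0)\alpha^{i-1}$, uses the elementary closed form $\sum_{i \ge n+1}(i+L_0)\alpha^{i-1} = \alpha^n\big((1-\alpha)^{-2} + (n+L_0)(1-\alpha)^{-1}\big)$, and the function names are hypothetical.

```python
def weighted_geometric_sum(alpha: float, lead: int, start: int = 1,
                           terms: int = 20_000) -> float:
    """Numerically sum (i + lead) * alpha**(i - 1) for i = start, start + 1, ..."""
    total = 0.0
    power = alpha ** (start - 1)
    for i in range(start, start + terms):
        total += (i + lead) * power
        power *= alpha
    return total

def closed_form(alpha: float, lead: int, start: int = 1) -> float:
    """Closed form of the same series, valid for 0 < alpha < 1."""
    n = start - 1  # number of skipped leading terms
    return alpha ** n * ((1 - alpha) ** -2 + (n + lead) / (1 - alpha))

def stated_bound(alpha: float, lead: int) -> float:
    """The factor 2 * (L_0 + 1) * (1 - alpha)^(-2) appearing in the bound."""
    return 2 * (lead + 1) * (1 - alpha) ** -2

# Example with alpha = 0.9 and L_0 = 2: the full sum is 100 + 20 = 120 <= 600.
full_sum = weighted_geometric_sum(0.9, 2)
```

Passing `start = n + 1` gives the tail of the same series, which decays like $\alpha^n$ as $n$ grows.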

With Lemma 7 in hand, we now complete the proof of Lemma 4.

Proof of Lemma 4. We first demonstrate that $V^\infty_\alpha(r,x) = \lim_{n\to\infty} V^n_\alpha(r,x)$, and complete the
proof of dH]). The existence of the corresponding limit follows from the monotonicity (in n) guaran¬ 
teed by Lemma El That Vff°{r,x) > lim„^oo Va{r,x) for all a G (0,1) and r, x G M follows immedi¬ 
ately from the definitions of the associated optimization problems. To prove the other direction, as 
well as dHl), we note that for any fixed n > 1 , it follows from the convexity ensured by Lemma El that 
there exists an optimal policy tt for the problem stated in the r.h.s. of ([5]) of base-stock form, with 
order-up-to levels Ci,..., (i.e. order up to level Ci in period i if the pre-order inventory level is 

below $C_i$, otherwise order nothing). Furthermore, it follows from Lemma 7 that $\max_{i=1,\dots,n} |C_i| \le S_\alpha(r)$.
Now, consider the policy $\pi'$ (for the problem stated in the r.h.s. of (6)) that orders up to
level $C_i$ in period $i$ if the pre-order inventory position is below $C_i$ and otherwise orders nothing in
periods $i = 1,\dots,n$; and orders nothing in all remaining periods, regardless of the inventory level.
It follows from a straightforward bounding argument that under policy $\pi'$, w.p.1 the absolute value
of the post-ordering inventory position in period i is at most |j;| -|- Sa{r) + {i — 1)|?’| + X]fc=i 
Thus by the dynamics of the underlying inventory problem and ([1]) , it follows that for alH > re -|- 1, 
Cf < max(6, /i)(|x| + 5'a(r) + (i+ Lo)|r| + (z + Lo)E[T>]). Thus since (by construction) Cf = for 
i G [l,re], it follows from definitions and straightforward algebra that 


E 




OO 

-V^(r,x) < max(6,/i)(5'„(r) + |x| + |r|+E[T)]) ^ (i + Lo)a*“^ 

i—n-\-l 

< max(6, h){Sa{r) + |x| + |r| +E[Z)])(1 + Lq + re)(l — 


This completes the proof of ([5]), and letting re ^ oo completes the proof that V^{r,x) = 
lim„^oo ic). Combining with Lemmas [3] and O the fact that convexity and monotonicity are 
preserved under limits, and a straightforward contradiction argument completes the proof of all 
parts of the lemma regarding properties of V^{r,x) and the associated optimization problems 
and optimal policies. Finally, we complete the proof of Q. It follows from Lemmas [3] and [3 
the already proven parts of Lemma 13 and the fact that Theorem [2] ensures G [0,E[L)]] that 
OPT(L) — c(E[Z)] — r^) is at least 

(I - a) {rL,S^{rL)) - 2max(6, h) (5„(E[L»]) +E[Z)]) (I + L)(l - . 

Combining with some straightforward algebra, the definition of and the already proven parts 
of Lemma 0] completes the proof. □. 

5.3. Proof of Corollary 4

Before proving Corollary 4, we will need a preliminary result which demonstrates that if $r$ is “very
close” to $\mathbb{E}[D]$, then $M_i^r$ is “very large” for an appropriate range of $i$. We will then use this result
to show that $r_L$ cannot be “too close” to $\mathbb{E}[D]$ by deriving a contradiction, showing that in this
case the optimal value would be strictly greater than $U$, which is impossible.

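Lemma 8. If there exists $\epsilon \in [0, \mathbb{E}[D] - Q_0]$ s.t. $r \in (\mathbb{E}[D] - \epsilon, \mathbb{E}[D]]$, then for all $i, j \in
[400 p_0^{-2}, (p_0\eta_0\epsilon^{-1})^2]$ s.t. $i > j$,

$$M_i^r - M_j^r \ge \frac{1}{5} p_0\eta_0\big(i^{1/2} - j^{1/2}\big) - (i-j)\epsilon - 2\eta_0\Big(\log\Big(\frac{i}{j}\Big) + 2\Big).$$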
Proof of Lemma 8. We note that the result would follow from well-known weak-convergence
results under additional assumptions on $D$ (e.g. finite variance, cf. Erdős and Kac (1946)), but to
avoid unnecessary assumptions (and for completeness) we provide a proof from first principles.

Let us fix any e € [0,E[Z)] — Qo] > ^ G {^[D] ~ k G [dOOpp (supposing this 

~2 

interval is non-empty, i.e. e < Let > 1} denote an i.i.d. sequence of r.v.s distributed 

as r — D conditioned on the event {r > D}, and {A~’^,i > 1 } denote an i.i.d. sequence of r.v.s 
distributed as D — r conditioned on the event {D > r}. Let denote a binomially distributed r.v. 
with parameters k,pr = P({r > D}), independent of {Af’',i > 1} and {A~’^,i > 1}. It follows from 
definitions and the constraints on e and r that 

Pr e [^Po,Po]- 

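Note that for $k \ge 1$, we may construct $W_k^r$ on an appropriate probability space s.t. $W_k^r =
\sum_{i=1}^{B_k^r} A_i^{+,r} - \sum_{i=1}^{k - B_k^r} A_i^{-,r}$, in which case (by non-negativity) $\mathbb{E}[\max(0, W_k^r)]$ is at least

$$\mathbb{E}\Bigg[\sum_{i=1}^{B_k^r} A_i^{+,r} - \sum_{i=1}^{k-B_k^r} A_i^{-,r} \,\Bigg|\, B_k^r \ge p_r k + \big(p_r(1-p_r)k\big)^{1/2}\Bigg] \, \mathbb{P}\Bigg(\frac{B_k^r - p_r k}{\big(p_r(1-p_r)k\big)^{1/2}} \ge 1\Bigg).$$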

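Furthermore, since $p_r \mathbb{E}[A_1^{+,r}] = (1-p_r)\mathbb{E}[A_1^{-,r}] - (\mathbb{E}[D] - r)$, it follows from non-negativity and
independence that

$$\begin{aligned}
\mathbb{E}\Bigg[\sum_{i=1}^{B_k^r} A_i^{+,r} - \sum_{i=1}^{k-B_k^r} A_i^{-,r} \,\Bigg|\, B_k^r \ge p_r k + \big(p_r(1-p_r)k\big)^{1/2}\Bigg]
&\ge \Big(p_r k + \big(p_r(1-p_r)k\big)^{1/2}\Big)\mathbb{E}[A_1^{+,r}] - \Big((1-p_r)k - \big(p_r(1-p_r)k\big)^{1/2}\Big)\mathbb{E}[A_1^{-,r}] \\
&= \big(p_r(1-p_r)k\big)^{1/2}\big(\mathbb{E}[A_1^{+,r}] + \mathbb{E}[A_1^{-,r}]\big) - k(\mathbb{E}[D] - r).
\end{aligned}$$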

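Let $N(0,1)$ denote a standard normal r.v. By the celebrated Berry-Esseen Theorem (cf. Korolev et al. (2010)),

$$\Bigg|\mathbb{P}\Bigg(\frac{B_k^r - p_r k}{\big(p_r(1-p_r)k\big)^{1/2}} \ge 1\Bigg) - \mathbb{P}\big(N(0,1) \ge 1\big)\Bigg| \le 2\big(p_r(1-p_r)k\big)^{-1/2}.$$

It is easily verified from definitions that

$$\mathbb{E}[A_1^{+,r}] + \mathbb{E}[A_1^{-,r}] \ge \eta_0, \qquad \big(p_r(1-p_r)\big)^{1/2} \ge p_0, \qquad \mathbb{P}\big(N(0,1) \ge 1\big) \ge \frac{1}{10}.$$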
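Thus combining the above, we conclude that

$$\mathbb{E}\Bigg[\sum_{i=1}^{B_k^r} A_i^{+,r} - \sum_{i=1}^{k-B_k^r} A_i^{-,r} \,\Bigg|\, B_k^r \ge p_r k + \big(p_r(1-p_r)k\big)^{1/2}\Bigg] \ge p_0\eta_0 k^{1/2} - \epsilon k, \qquad (15)$$

and

$$\mathbb{P}\Bigg(\frac{B_k^r - p_r k}{\big(p_r(1-p_r)k\big)^{1/2}} \ge 1\Bigg) \ge \frac{1}{10} - 2p_0^{-1}k^{-1/2}. \qquad (16)$$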

As our assumptions on $\epsilon, r, k$ ensure that the r.h.s. of both (15) and (16) are non-negative, we
conclude that $\mathbb{E}[\max(0, W_k^r)]$ is at least $\big(p_0\eta_0 k^{1/2} - \epsilon k\big)\big(\frac{1}{10} - 2p_0^{-1}k^{-1/2}\big)$, which is itself at least
$\frac{1}{10}p_0\eta_0 k^{1/2} - k\epsilon - 2\eta_0$. Thus by Lemma 5,




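$$\begin{aligned}
M_i^r - M_j^r &= \sum_{l=j}^{i-1} l^{-1}\,\mathbb{E}\big[\max(0, W_l^r)\big] \\
&\ge \frac{1}{10}p_0\eta_0\sum_{l=j}^{i-1} l^{-1/2} - (i-j)\epsilon - 2\eta_0\sum_{l=j}^{i-1} l^{-1} \\
&\ge \frac{1}{10}p_0\eta_0\int_j^i x^{-1/2}\,dx - (i-j)\epsilon - 2\eta_0\Big(\log\Big(\frac{i}{j}\Big) + 2\Big) \\
&= \frac{1}{5}p_0\eta_0\big(i^{1/2} - j^{1/2}\big) - (i-j)\epsilon - 2\eta_0\Big(\log\Big(\frac{i}{j}\Big) + 2\Big),
\end{aligned}$$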


i-l 


where we have used the well-known fact that for all re > 1, log(re) < — log(re) -|-2. Combining 

the above completes the proof. □ 
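The proof above converts bounds on $\mathbb{E}[\max(0, W_k^r)]$ into bounds on differences of expected running maxima through a sum with weights $1/k$. The following sketch assumes that this step is a Spitzer-type identity, $\mathbb{E}[\max(0, S_1, \dots, S_n)] = \sum_{k=1}^{n} k^{-1}\mathbb{E}[\max(0, S_k)]$ (the exact indexing used in Lemma 5 may differ), and verifies the classical identity exactly for a simple symmetric $\pm 1$ random walk. The walk and all function names are hypothetical choices made only for this illustration.

```python
from fractions import Fraction
from itertools import product
from math import comb

def expected_running_max(n: int) -> Fraction:
    """Exact E[max(0, S_1, ..., S_n)] for a +/-1 walk, by enumerating all 2^n paths."""
    total = 0
    for steps in product((1, -1), repeat=n):
        position, running_max = 0, 0
        for step in steps:
            position += step
            running_max = max(running_max, position)
        total += running_max
    return Fraction(total, 2 ** n)

def expected_positive_part(k: int) -> Fraction:
    """Exact E[max(0, S_k)], using S_k = 2 * Binomial(k, 1/2) - k."""
    total = sum(comb(k, ups) * max(0, 2 * ups - k) for ups in range(k + 1))
    return Fraction(total, 2 ** k)

def spitzer_sum(n: int) -> Fraction:
    """Right-hand side of Spitzer's identity: sum_{k=1}^n E[max(0, S_k)] / k."""
    return sum((expected_positive_part(k) / k for k in range(1, n + 1)), Fraction(0))
```

Because both sides are computed with exact rational arithmetic, the comparison `expected_running_max(n) == spitzer_sum(n)` can be checked for small `n` without any simulation error.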

Before completing the proof of Corollary 4, it will be useful to collect a few additional auxiliary
bounds. For $\alpha \in (0,1)$, let $G_\alpha$ denote a geometrically distributed r.v. with success probability $1 - \alpha$,
i.e. P(Ga = fe) = (1 — a)a^~^ for fe > 1, independent of ^ k > 1}, and nia = denote a 

median of Ga- Note that the memoryless property implies P(Gq, > 2ma) > Let = 2~^, and 

4 ,2 

4q = 2 . 


Lemma 9. 

(i) .998<eo<l-eo<lo<l- 

(a) L > eg ^ implies e((^Lexp(—egL) < 25. 
(Hi) aG[Cg)^o] implies: 

a , 2ma G [400po^(pg7yge(C^)2]; 


• m, 
• $\frac{1}{2}(1-\alpha)^{-1} \le m_\alpha \le 4(1-\alpha)^{-1}$.

Proof of Lemma 9.

[11 That > .998 follows from the fact that Pq < 1, and thus > 2“3nn > .998. That < 1 — Cq 
follows from the fact that (by definition) €o < 1 “ We now prove that 1 — Cq < ^q. By definition 
Co < jiVoPof, which implies that < Cq. Combining with the exponential inequality and the 

fact that log(2) < 1, we conclude that 

?o>2"^'’>l-log(2 )eo>l-eo, 

completing the proof. As trivially < 1, this completes the demonstration. 

ini It is easily verified that Ci(A) = Lexp(—CoT) is decreasing in L on [cq \ oo). Thus L > Cg ^ implies 

eo^Lexp(-eoT) < eo\i(eo^) = eg ® exp(-eo ^). 

As C 2 (eo) = ^0 ^ increasing in eg on (0 ,|), and by definition eg < i, it follows that 

( 2 (^ 0 ) < C 2 (|) < 25. Combining the above completes the proof. 

(ml The first assertion follows immediately from the dehnitions of ^g,^g, and ma, and a straight¬ 
forward calculation. To prove the second assertion, note that due to (ED, « e [|g,?o] implies 
$\alpha \in (.998,1)$. It follows from a straightforward Taylor expansion of the logarithm function that
$\alpha \in (.998,1)$ implies $-2(1-\alpha) \le \log_2(\alpha) \le -(1-\alpha)$, and thus $\frac{1}{2}(1-\alpha)^{-1} \le -\frac{1}{\log_2(\alpha)} \le (1-\alpha)^{-1}$.

Noting that $\alpha \in (.998,1)$ implies $\lceil (1-\alpha)^{-1} \rceil \le 4(1-\alpha)^{-1}$ completes the proof.
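The bounds on the median of the geometric distribution can be checked numerically. The sketch below is a hedged illustration: it assumes the median is taken to be $m_\alpha = \lceil -1/\log_2(\alpha) \rceil$, the smallest $m$ with $\mathbb{P}(G_\alpha \le m) \ge 1/2$, and the function names are hypothetical.

```python
import math

def geometric_median(alpha: float) -> int:
    """Smallest m with P(G <= m) = 1 - alpha**m >= 1/2, i.e. ceil(-1 / log2(alpha))."""
    return math.ceil(-1.0 / math.log2(alpha))

def check_alpha(alpha: float) -> bool:
    """Check the logarithm sandwich and the resulting bounds on the median."""
    gap = 1.0 - alpha
    m = geometric_median(alpha)
    # -2(1 - alpha) <= log2(alpha) <= -(1 - alpha) for alpha close to 1.
    log_ok = -2.0 * gap <= math.log2(alpha) <= -gap
    # m is a median: P(G <= m) >= 1/2 and P(G >= m) = alpha**(m - 1) >= 1/2.
    median_ok = (1.0 - alpha ** m >= 0.5) and (alpha ** (m - 1) >= 0.5)
    # (1/2)(1 - alpha)^(-1) <= m <= 4(1 - alpha)^(-1).
    bounds_ok = 0.5 / gap <= m <= 4.0 / gap
    return log_ok and median_ok and bounds_ok

median_at_0999 = geometric_median(0.999)
```

For instance, with $\alpha = 0.999$ the median is of order $(1-\alpha)^{-1} = 1000$, consistent with the two-sided bound.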

With Lemmas 8 and 9 in hand, we now complete the proof of Corollary 4.
Proof of CoroUary\^ Suppose for contradiction that for some L > Cg ^ + Lg -|- 1, it holds that 
> E[D] — eg. In this case, it follows from Corollary [3l ([T|), and Jensen’s inequality that for all 
ae (i,l), 

00 

OPT(L) > min(6, h) inf ^(1 - a)a’^-^\S + \ - Uo2^°(l - a)-^La^. (17) 






Note that we may interpret the r.h.s. of (|17l) as an appropriate single-stage newsvendor problem 
(with ordering level S and demand distributed as ) . We conclude from Lemmas [8] and [9l well- 


known results for the newsvendor problem (cf. 
for all aG 


Zipkin (2000)), and the memoryless property that


OPT(L) > min(6,h)E[|M;;^^-M^t|] 

> i min(6, h) - Uo2^°{l - a)-^La^ 

> imin(6,/i)Qpo%((2m„)^ - m2) - eom„ - 27yo(log(2)-k 2)^ -Uo2^°{l - a)~^La^ 

> ■^min{b,h)por]o{2^ -l)mi - eouia - 6r]o - Uo2^°{l - a)~^La^ 

> Co(l — a)~^ — 4eo(l — a)~^ — Gtjq — Uo2^°{l — a)~^La^. 


Setting $\alpha = 1 - \epsilon_0$, and combining the above with Lemma 9(i) and the fact that $1 - \epsilon_0 \le \exp(-\epsilon_0)$,
we conclude that

OPT(L) > CoCg ^ - Uo2^°e~^Lexp{-€oL)-eiTjo + l). 

Applying Lemma 9(ii) and the fact that $\mathrm{OPT}(L) \le U$, along with a straightforward contradiction
argument, completes the proof. □
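The proof above invokes well-known results for the single-stage newsvendor problem. As a hedged illustration, the sketch below checks the classical critical-fractile rule on a small discrete example: the minimizer of $\mathbb{E}[h(S-X)^+ + b(X-S)^+]$ is a $b/(b+h)$ quantile of $X$, and in particular a median when $b = h$. The demand distribution, cost values, and function names are hypothetical and are not taken from the paper.

```python
from typing import Sequence

def newsvendor_cost(s: float, demand: Sequence[float], probs: Sequence[float],
                    b: float, h: float) -> float:
    """Expected cost E[h * (s - X)^+ + b * (X - s)^+] for discrete demand X."""
    return sum(p * (h * max(s - x, 0.0) + b * max(x - s, 0.0))
               for x, p in zip(demand, probs))

def critical_fractile_level(demand: Sequence[float], probs: Sequence[float],
                            b: float, h: float) -> float:
    """Smallest support point s with P(X <= s) >= b / (b + h)."""
    target = b / (b + h)
    cumulative = 0.0
    for x, p in sorted(zip(demand, probs)):
        cumulative += p
        if cumulative >= target - 1e-12:
            return x
    return max(demand)

demand = [1.0, 2.0, 3.0, 4.0]
probs = [0.1, 0.2, 0.3, 0.4]

# Brute-force minimizers over the support, for symmetric and asymmetric costs.
best_symmetric = min(demand, key=lambda s: newsvendor_cost(s, demand, probs, 1.0, 1.0))
best_asymmetric = min(demand, key=lambda s: newsvendor_cost(s, demand, probs, 3.0, 1.0))
```

With equal costs the brute-force minimizer coincides with the median of the demand, which is the case relevant to an absolute-deviation objective of the form $\inf_S \mathbb{E}|S - X|$.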






